CHAPTER 10


Graphical  Exploratory  Analysis  of 
Variance  Illustrated  on  a  Splitting 
of  the  Johnson  and  Tsao  Data1 

Eugene  G.  Johnson 
Educational  Testing  Service 

John  W.  Tukey 
Princeton  University 


INTRODUCTION 

Since  our  main  purpose  is  to  illustrate  how  a  combination  of 
graphic  display  and  simple  arithmetic  can  be  used  to  enhance  the 
effectiveness  of  Daniel's  half-normal  plots,  we  shall  focus  on  an 
analysis of a 2 x 4 x 7 data set provided by the responses of person
IB1 in the "methodological experiment" presented by (Palmer)
Johnson  and  Tsao  (1944)  and  also  analyzed  by  Palmer  Johnson  (1949) 
and  by  Green  and  Tukey  (1960).  The  full  data  set  involves  8  persons. 
We  will  return  below  to  a  fuller  description  of  the  data  set  and  a 
discussion  of  how  it  might  be  analyzed,  and  elsewhere  to  a  discussion 
of  how  the  analysis  might  be  carried  further,  with  special  attention  to 
the  identification  and  sterilization  of  exotic  values. 

We need to make clear our overall prejudices and purposes: a
(well-founded) prejudice that most analyses of variance with 3 or
more factors are used for exploration rather than confirmation, with a
clear description of the apparent behavior being much more
important than formal significance tests, and a clear purpose to reach
as simple and complete a description of the data behavior, in the
instance before us, as we know how, taking reasonable account of
questions of multiplicity, but not overemphasizing precise
significance. We want a description of the appearance of our data,
even though more data might not confirm that appearance.

1 Prepared in connection with research at Princeton University
sponsored by the U. S. Army Research Office (Durham).

Since  we  further  believe  that  good  techniques  come  from  the 
accretion  of  many  ideas,  not  just  from  a  single  brain  wave,  we  are  not 
dismayed  by  the  appearance  of  at  least  12  conceptual  ingredients  (3 
old,  1  due  to  Daniel).  Rather  we  wonder  where  the  13th  and  14th 
will  come  from.  To  tease  the  reader's  imagination,  we  list  the  12 
ingredients  so  far  at  hand  (the  later  ones  need  not  be  as  large  or 
important  as  the  earlier): 

1)  classical  analysis  of  variance 

2)  aggregation 

3)  half-normal  plotting 

4)  horizontalized  plotting 

5)  scission  into  bouquets  of  contrasts 

6)  pretrimming  by  nomination 

7)  post-trimming  by  election 

8)  nominated  bouquets 

9)  2nd  order  trimming  (super-election) 

10)  reformulating  a  response 

11)  rethinking  a  scission 

12)  refactoring  an  analysis 

We  believe  that  these  ingredients  can  be  used  in  any  factorial  analysis 
of  variance  —  and  in  many  others  of  different  form. 

The  basic  elements  underlying  all  this  are: 

(A) basic ANOVA concepts of decomposition: of separating
each  number  into  parts,  each  part  coming  as  purely  as  we  can 
arrange  from  its  specified  "source," 

(B)  anticipation  of  revision  in  the  light  of  the  data,  not  only  of 
numbers  but  also  of  the  style  and  form  of  separation, 

(C)  use  of  long-term  insight  to  select  specifics  for  trial, 

(D)  use  of  pictures  to  see  what  may  need  special  treatment  or 
modification, 

(E) use of arithmetic to conduct modifications,

(F) ultimately a combination of (a) numerical summaries,
hopefully depictable, and (b) pictorially apparent absence of what
else might plausibly be present.

We  now  discuss  the  ingredients,  and  explain  their  concatenation 
and  mixing,  in  terms  of  the  single  2x4x7  data  set  mentioned 
above,  showing  how  they  lead  to  a  reasonably  compact  description  of 
the 56 numbers.

PART A: ANALYZING IB1's PERFORMANCE IN GRAMS

A1. Looking at the Data Overall

The  2x4x7  data  set  for  person  IB1  (one  of  the  two  blind  males 
in  the  full  experiment)  involves  2  dates  (1,2),  four  rates  (50,  100,  150 
and  200  grams  per  30  seconds),  and  seven  (initial)  weights  (100,  150, 
200,  250,  300,  350  and  400  grams).  The  experimental  procedure 
involved  attaching  a  pail  by  a  lever  system  to  a  ring  on  the  subject's 
finger.  One  of  the  seven  initial  weights  was  placed  into  the  pail,  and 
then  water  was  allowed  to  flow  into  the  pail  at  one  of  four  constant 
rates  until  the  subject  reported  a  change  in  pull  on  the  finger.  The 
intended  response,  the  difference  limen  (D.L.),  was  measured  by  the 
amount  of  water  added  by  time  of  report.  Five  determinations  were 
made  for  each  of  the  28  rate-weight  combinations,  and  the  average  of 
these  values  was  used  as  the  response.  The  entire  experiment  was 
conducted,  for  each  person,  at  each  of  two  dates,  one  week  apart. 

The  full  experiment  consisted  of  8  persons,  two  persons  in  each 
cell  of  a  2  x  2  design  for  male  vs.  female  and  sighted  vs.  blind.  In 
their analysis of the complete data set, Green and Tukey noted that
person  IB1  had  a  pattern  of  response  that  was  considerably  different 
from  that  of  the  other  persons  (including  the  other  blind  male)  and 
designated  him  as  the  "eccentric  blind  man."  For  this  very  reason, 
we  have  selected  person  IB1  for  our  initial  (within  person)  analysis. 


TABLE 1. Average difference limen in grams for person IB1: male,
blind.

   Rate                         Initial Weight (Grams)
(gm/30 sec)  Date    100     150     200     250     300     350     400

     50        1     24.2    25.3    25.1    17.6    20.7    19.4    17.3
               2     41.2    29.8    28.5    23.8    20.9    17.8    13.4

    100        1     48.1    41.2    31.4    30.4    39.9    36.7    35.5
               2     59.1    59.7    48.7    38.1    30.7    28.4    27.2

    150        1     60.9    52.0    58.2    60.6    57.1    57.9    49.5
               2     75.8    79.9    69.1    64.4    42.2    53.1    36.3

    200        1     69.9    76.7    82.4    76.4    71.4    76.9    79.6
               2    148.3   123.1    73.5    61.9    77.8    56.0    53.2

As a first step in the analysis of the performance of IB1 to the
experiment, we display his responses. The actual values appear in
Table 1; a graphical display of the responses is shown in Figure 1.
Figure 1 consists of a series of 7 plots, one for each level of weight
(indicated on the horizontal axis). Each of the seven plots shows the
relationship between the response (D.L. in grams) and rate for both
dates. The responses for each date are connected by a broken line.
In considering Figure 1, the first thing that strikes the eye is the
strong linear relationship between D.L. and rate. In fact, such a
relationship, which was also found by Johnson and Tsao and by
Green and Tukey, holds for each person. In Part C, we will consider
an analysis of the data for person IB1 which uses a different
dependent variable (log(response time)) and which produces a
particularly simple interpretation of the relationships between level
of response and the various factors. For the moment, however, we
will press forward with an analysis of person IB1 with response in
the original units, D.L. in grams.


Figure 1. Person IB1: male, blind. Average difference limen (grams).
(One panel per weight: 100, 150, 200, 250, 300, 350, 400. Rates within
weight and date are connected; the plot symbol is the date.)


Crude Classical Analysis of Variance

A  standard  analysis  of  variance  table  of  the  2x4x7  factorial 
design  for  person  IB1  with  difference  limens  in  grams  as  the 
dependent  variable  is  given  in  Table  2.  In  the  table,  we  have 
denoted  the  independent  variables  as  R  for  rate,  W  for  weight  and  D 
for  date.  In  determining  the  values  for  "F"  and  the  significance 
levels,  the  line  of  the  table  corresponding  to  the  three-factor 
interaction  DRW  was  (naively)  taken  as  measuring  appropriate  error 
for  each  of  the  other  lines  (extreme  model  1). 
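
The mean squares of Table 2 can be recomputed directly from Table 1. The following is only a sketch of the standard balanced-factorial arithmetic, not a record of how the table was originally computed; the names `TABLE1`, `marginal_means` and `anova_table` are ours. With the Table 1 values as transcribed here, the rate and date lines come out at about 8514 and 348, in agreement with Table 2.

```python
from itertools import combinations
from math import prod

FACTORS = ("D", "R", "W")            # date, rate, weight
LEVELS = {"D": 2, "R": 4, "W": 7}

# Table 1: TABLE1[rate][date] -> responses at the seven weights (100..400 g).
TABLE1 = [
    [[24.2, 25.3, 25.1, 17.6, 20.7, 19.4, 17.3],
     [41.2, 29.8, 28.5, 23.8, 20.9, 17.8, 13.4]],
    [[48.1, 41.2, 31.4, 30.4, 39.9, 36.7, 35.5],
     [59.1, 59.7, 48.7, 38.1, 30.7, 28.4, 27.2]],
    [[60.9, 52.0, 58.2, 60.6, 57.1, 57.9, 49.5],
     [75.8, 79.9, 69.1, 64.4, 42.2, 53.1, 36.3]],
    [[69.9, 76.7, 82.4, 76.4, 71.4, 76.9, 79.6],
     [148.3, 123.1, 73.5, 61.9, 77.8, 56.0, 53.2]],
]

# One response per cell, keyed by (date, rate, weight) level indices.
Y = {(d, r, w): TABLE1[r][d][w]
     for r in range(4) for d in range(2) for w in range(7)}


def marginal_means(y, keep):
    """Average y over every factor not named in `keep`."""
    sums, counts = {}, {}
    for cell, value in y.items():
        key = tuple(cell[FACTORS.index(f)] for f in keep)
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def anova_table(y):
    """Return {source: (df, sum of squares, mean square)} for each line."""
    means = {keep: marginal_means(y, keep)
             for n in range(len(FACTORS) + 1)
             for keep in combinations(FACTORS, n)}
    table = {}
    for n in range(1, len(FACTORS) + 1):
        for source in combinations(FACTORS, n):
            ss = 0.0
            for cell in y:
                # Inclusion-exclusion over sub-tables gives the fitted effect.
                effect = 0.0
                for k in range(n + 1):
                    for sub in combinations(source, k):
                        key = tuple(cell[FACTORS.index(f)] for f in sub)
                        effect += (-1) ** (n - k) * means[sub][key]
                ss += effect ** 2
            df = prod(LEVELS[f] - 1 for f in source)
            table["".join(source)] = (df, ss, ss / df)
    return table


if __name__ == "__main__":
    for source, (df, ss, ms) in anova_table(Y).items():
        print(f"{source:4s} df={df:2d}  MS={ms:8.1f}")
```

Dividing each mean square by the DRW mean square then gives the naive "F" column described above.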

Given  the  strong  linear  relationship  between  D.L.  and  rate  within 
each  date-by-weight  combination  noted  from  Figure  1,  it  is  not  at  all 
surprising  to  find  that  the  largest  mean  square  is  that  associated  with 
rate.  The  only  other  "significant"  lines  in  Table  2  are  the  weight 
main  effect  and  (somewhat  less  strongly)  the  date  x  weight  two-factor 
interaction. 


Crude  Aggregated  Analysis  of  Variance 

Using  the  sort  of  aggregation  proposed  by  Green  and  Tukey  (and 
described  in  section  A9),  the  result  would  be  as  in  Table  3. 

We shall consider more refined analyses starting from analogs of
these two tables. The results are similar to those for the classical
analysis, although the 2 x 7 date and weight table is reassembled
(from D, W and DW).
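
The pooling arithmetic behind Table 3 can be sketched from the entries of Table 2. The rules that decide which lines to aggregate are those of Green and Tukey (section A9) and are not reproduced here; we simply assume, from the degrees of freedom, that "Date and Weight" pools D, W and DW and that "Residual" pools DR, RW and DRW. The function name `pool` is ours.

```python
# (degrees of freedom, mean square) for each line of Table 2.
TABLE2 = {
    "D": (1, 348), "R": (3, 8514), "W": (6, 772),
    "DR": (3, 21), "DW": (6, 545), "RW": (18, 74), "DRW": (18, 149),
}


def pool(lines, table=TABLE2):
    """Pool ANOVA lines by adding sums of squares and degrees of freedom."""
    df = sum(table[name][0] for name in lines)
    ss = sum(table[name][0] * table[name][1] for name in lines)
    return df, ss / df


date_and_weight = pool(["D", "W", "DW"])     # 13 df
residual = pool(["DR", "RW", "DRW"])         # 39 df

f_rate = TABLE2["R"][1] / residual[1]
f_date_and_weight = date_and_weight[1] / residual[1]
```

Because the inputs are the rounded mean squares of Table 2, the pooled values agree with Table 3 only up to rounding (mean squares near 635 and 105, F-values near 81 and 6).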


Single  df's 

A  (traditional)  [aggregated]  analysis  would  now  proceed  to  pull 
apart  from  the  various  (significant)  [remaining]  lines  one  or  more 
single-degree-of-freedom  components,  comparing  the  magnitudes  of 
each  of  the  constituent  contrasts  with  that  of  (the  three-factor 
interaction  mean  square)  [an  appropriate  aggregated  denominator] 
taken  to  represent  error.  Rather  than  beginning  with  such  an 
approach,  we  will  cut  up  each  of  the  lines  of  the  (full)  [aggregated] 
analysis  of  variance  table  (including  the  DRW  line)  into  single 
degree-of-freedom  components  —  contrasts  —  and  compare  the  sizes 
(absolute  values)  of  these  contrast  components  graphically  by  a 
technique  related  to  Daniel's  half-normal  plots. 

In  such  an  analysis,  to  be  described  shortly,  we  do  not  assume  that 
any  prechosen  line  of  any  analysis  of  variance  table  is  necessarily 
solely measuring error. Rather, we assume that any of the selected
contrasts which combine to constitute some line (i.e., a main effect, a
two-factor interaction, an n-factor interaction) might have mean
values  different  from  zero,  but  that  most  of  the  totality  of  them  will 
serve  for  error  estimates.  That  is,  we  assume  the  bulk  of  the 
(properly  defined)  single-degree-of-freedom  components  are  actually 
measuring  error  —  or  come  close  to  doing  so  —  but  we  do  not  a  priori 
specify  which  ones,  or  which  error. 

We  selected  our  2x4x7  subset  of  the  data  in  such  a  way  as  to 
avoid  major  complications  with  multiple  error  terms,  which  arise 
when  the  whole  data  set  is  considered.  Similar  techniques  will  apply 
to  factorial  data  sets  deserving  more  error  terms. 


TABLE 2. Standard analysis-of-variance table for person IB1 (dependent
variable is difference limen in grams).

Source     df      MS     DEN        F      Sig

D           1     348     149     2.33      not
R           3    8514     149    57.06    0.01%
W           6     772     149     5.18     0.5%
DR          3      21     149     0.14      not
DW          6     545     149     3.65     2.5%
RW         18      74     149     0.50      not
DRW        18     149       -        -        -

TABLE 3. Aggregated analysis-of-variance table for person IB1.

Label               df      MS      DEN          F     Sig*

Rate                 3    8514    105**    81.10**    0.01%
Date and Weight     13     635      105       6.05    0.01%
Residual            39     105        -          -

*  Notice that (a) significance levels of F-values require a large,
rather unspecified multiplier for multiplicity and (b) 0.01% is the
most extreme level considered.

** Would have been 635 and 13.4, respectively, had any rate interaction
appeared in the following (date and weight) aggregation.


A2. Bouquets of Contrasts

Since  we  are  going  to  focus  on  single  degrees  of  freedom,  we  need 
to  break  up  each  factor  into  a  bouquet  of  contrasts,  each  a  single 
degree  of  freedom.  We  expect  to  use  the  natural  combinations  (outer 
products)  of  these  one-factor  contrasts  to  also  break  up  the  two-factor 
and three-factor interactions into single-degree-of-freedom contrasts.

Because  the  two  factors  to  be  broken  up  (date  is  already  a  single 
contrast)  are  scales  with  equispaced  versions  (levels),  it  is  perhaps 
natural  to  consider  the  classical  orthogonal  polynomials  as  a  possible 
first  choice  for  the  basic  bouquet.  Besides  these  contrasts  there  are 
other  types  of  orthogonal  contrasts  which,  depending  on 
circumstances,  may  have  greater  utility.  We  shall  return  to  some  of 
these  in  section  D2. 

If  the  versions  of  our  factors  had  been  only  ordered,  not 
measured,  we  might  have  followed  Abelson  and  Tukey  (1963)  in 
selecting  an  initial  contrast,  or  conceivably,  have  separated  the 
response  into  monotone  increasing  and  monotone  decreasing  parts. 
Unordered  versions  can  often  be  sensibly  partitioned,  although  we 
may  have  to  be  somewhat  arbitrary  in  some  or  all  of  our  choices  of 
contrasts. 

Various  bouquets  of  contrasts  have  their  place  in  the  analysis  of 
data.  We  will  use  a  number  of  different  bouquets  in  our  analysis  of 
our 2 x 4 x 7 data set. Given a collection of bouquets of
single-degree-of-freedom contrasts, one for each line of the standard
analysis  of  variance  table,  the  next  step  in  the  continued  analysis  of 
the  data  is  the  assessment  of  what  the  values  of  the  contrasts  are 
trying  to  tell  us  about  the  relationship  within  and  between  the 
various  factors.  This  assessment  will  be  done  graphically  via  a 
procedure  related  to  Daniel's  half-normal  plots. 
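
As an illustration of the scission described in this section, the sketch below builds orthonormal polynomial contrasts for each factor by Gram-Schmidt and forms their outer products over the 2 x 4 x 7 cells. It is one possible construction, offered under our own assumptions: the function names are ours, and contrasts are scaled to unit length so that each squared contrast is a single-degree-of-freedom sum of squares. Labels follow the convention used later in the text (for example DW12).

```python
from itertools import product
from math import prod, sqrt


def polynomial_bouquet(levels):
    """Orthonormal polynomial contrasts (linear, quadratic, ...) for a factor."""
    n = len(levels)
    mean = sum(levels) / n
    scale = max(abs(v - mean) for v in levels)
    x = [(v - mean) / scale for v in levels]    # rescaled for stability
    basis = [[1.0 / sqrt(n)] * n]               # constant term, dropped below
    for degree in range(1, n):
        vec = [v ** degree for v in x]
        # Gram-Schmidt: remove what the lower-order terms already describe.
        for b in basis:
            dot = sum(p * q for p, q in zip(vec, b))
            vec = [p - dot * q for p, q in zip(vec, b)]
        norm = sqrt(sum(p * p for p in vec))
        basis.append([p / norm for p in vec])
    return basis[1:]


def outer(*vectors):
    """Outer product of one-factor vectors, flattened over the cells."""
    return [prod(coefs) for coefs in product(*vectors)]


def all_bouquets(factor_levels):
    """Group every single-df contrast of the factorial by ANOVA line."""
    choices = []
    for name, levels in factor_levels.items():
        const = [1.0 / sqrt(len(levels))] * len(levels)
        options = [("", 0, const)]
        options += [(name, order, c) for order, c
                    in enumerate(polynomial_bouquet(levels), start=1)]
        choices.append(options)
    bouquets = {}
    for combo in product(*choices):
        line = "".join(name for name, _, _ in combo)
        if not line:
            continue    # all-constant vector: the grand mean, not a contrast
        orders = "".join(str(order) for name, order, _ in combo if name)
        bouquets.setdefault(line, {})[line + orders] = outer(
            *(c for _, _, c in combo))
    return bouquets


FACTOR_LEVELS = {"D": [1, 2],
                 "R": [50, 100, 150, 200],
                 "W": [100, 150, 200, 250, 300, 350, 400]}
BOUQUETS = all_bouquets(FACTOR_LEVELS)
```

The seven bouquets hold 1, 3, 6, 3, 6, 18 and 18 contrasts, 55 in all, one for each degree of freedom of the 56 numbers about their mean.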


A3.  Horizontalizing  Plots 

The classic "half-normal plot" relates the sizes (absolute values) of
the normalized contrasts, ordered by size, with typical values of
order-statistics of the half-Gaussian distribution by plotting (ordered)
size of contrast versus typical order-statistic. The corresponding
natural reference is a line through the origin, whose slope corresponds
to an estimate of the underlying scale $\sigma$. To make internal
comparison much easier, we shall instead plot

$$\frac{\text{size of contrast}}{\text{typical order statistic}}
\quad\text{versus}\quad \text{typical order statistic},$$

thus making the natural reference a horizontal line, whose height
corresponds to an estimate of $\sigma$.


Display  Ratios 


While classical probability plots often use the unit deviates
corresponding to $i/(d+1)$ or $(i-\tfrac12)/d$ as the typical order
statistics for a sample of $d$ values, we shall work very close to the
order-statistic medians by using deviates corresponding to
$(3i-1)/(3d+1)$. For the half-Gaussian (the distribution of the
positive square root of any chi-square on one degree of freedom), this
means using the half-Gaussian working values, the $i$th such (of $d$)
being

$$c(i:d) = \Phi^{-1}\!\left(\tfrac12 + \tfrac12\cdot\frac{3i-1}{3d+1}\right)
         = \Phi^{-1}\!\left(\frac{3i+3d}{6d+2}\right),$$

where $\Phi$ is the unit Gaussian cumulative distribution function.
Thus, we shall plot the

$$\text{display ratio} = \frac{|C(i:d)|}{c(i:d)}$$

versus the typical order statistic $c(i:d)$, where $|C(i:d)|$ is the
$i$th size of contrast in increasing order (so that $|C(d:d)|$ is the
largest). Under the simple (null) model that the sizes of contrast
$|C(1:d)|, \ldots, |C(d:d)|$ represent a set of order statistics
from a sample of size $d$ from a half-Gaussian distribution with scale
$\sigma$ and location 0, each of the $d$ display ratios provides an
estimate of $\sigma$.
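
To make the recipe concrete, the sketch below computes the working values c(i:d) = Phi^{-1}((3i + 3d)/(6d + 2)) and the display ratios for the rate bouquet of person IB1. It rests on assumptions of ours: the function names are illustrative, the contrasts use the classical orthogonal polynomial coefficients for four equispaced levels, and each contrast is scaled to unit length over the 56 cells. The responses are those of Table 1, grouped by rate.

```python
from math import sqrt
from statistics import NormalDist

# Table 1 responses grouped by rate (date 1 row followed by date 2 row).
BY_RATE = {
    50:  [24.2, 25.3, 25.1, 17.6, 20.7, 19.4, 17.3,
          41.2, 29.8, 28.5, 23.8, 20.9, 17.8, 13.4],
    100: [48.1, 41.2, 31.4, 30.4, 39.9, 36.7, 35.5,
          59.1, 59.7, 48.7, 38.1, 30.7, 28.4, 27.2],
    150: [60.9, 52.0, 58.2, 60.6, 57.1, 57.9, 49.5,
          75.8, 79.9, 69.1, 64.4, 42.2, 53.1, 36.3],
    200: [69.9, 76.7, 82.4, 76.4, 71.4, 76.9, 79.6,
          148.3, 123.1, 73.5, 61.9, 77.8, 56.0, 53.2],
}

# Classical orthogonal polynomial coefficients for 4 equispaced levels.
POLY4 = {"R1": (-3, -1, 1, 3), "R2": (1, -1, -1, 1), "R3": (-1, 3, -3, 1)}


def working_value(i, d):
    """Half-Gaussian working value c(i:d)."""
    return NormalDist().inv_cdf((3 * i + 3 * d) / (6 * d + 2))


def rate_contrasts():
    """Unit-length polynomial contrasts in rate, applied to the 56 numbers."""
    totals = [sum(BY_RATE[rate]) for rate in sorted(BY_RATE)]
    cells_per_level = 14
    values = {}
    for label, coefs in POLY4.items():
        norm = sqrt(cells_per_level * sum(c * c for c in coefs))
        values[label] = sum(c * t for c, t in zip(coefs, totals)) / norm
    return values


def display_ratios(contrasts):
    """Map {label: contrast} to {label: (working value, display ratio)}."""
    d = len(contrasts)
    ordered = sorted(contrasts.items(), key=lambda item: abs(item[1]))
    ratios = {}
    for i, (label, value) in enumerate(ordered, start=1):
        c = working_value(i, d)
        ratios[label] = (c, abs(value) / c)
    return ratios


if __name__ == "__main__":
    for label, (c, ratio) in display_ratios(rate_contrasts()).items():
        print(f"{label}: working value {c:.3f}, display ratio {ratio:.1f}")
```

Under these assumptions the display ratios for R1 and R2 come out near 124 and 16, which is how they appear in Table 4.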

We  compute  the  display  ratios  separately  for  each  bouquet  of 
contrasts,  one  bouquet  for  each  line  of  a  conventional  analysis  of 
variance  table.  A  plot  of  display  ratio  versus  working  value  for  a 
particular  bouquet  shows: 

1)  the  general  level  of  variability,  hopefully  background 
noise,  captured  by  a  typical  defining  contrast  of  the 
bouquet  and  measured  by  a  horizontal  line,  and 

2)  the  relative  magnitude  of  the  various  sizes  of  contrast  in 
terms  of  the  general  level  for  that  bouquet  and  in  terms  of 
what  would  be  expected  under  a  simple  (null)  model. 



We  tend  to  focus  on  the  display  ratios  for  the  largest  sizes  of 
contrasts,  interpreting  relatively  large  display  ratios  as  indicating 
potentially  meaningful  contributions,  likely  to  be  worth  separate 
description. 

A  plot  of  display  ratio  versus  working  value  for  a  bouquet  will 
sometimes  produce  slightly  confusing  appearances,  when  granularity, 
arithmetic  errors,  or  other  causes  of  individual  exotic  values  keep  the 
contrasts  of  smallest  size  from  being  as  small  as  a  simple  model 
suggests  they  ought  to  be.  Thus,  relatively  high  values  of  the  display 
ratio  for  quite  small  working  values  should  often  be  ignored.  If 
considered,  they  should  usually  be  regarded  as  suggesting  isolated 
errors,  exoticities  or  granularities.  (We  turn  later  (elsewhere)  to 
looking  for  such  isolated  phenomena.)  A  general  downward  trend 
(to  the  right)  invites  similar  interpretation  and  treatment. 

Although  we  compute  the  display  ratios  separately  for  each 
bouquet,  ordinarily  we  will  overlay  the  plots  for  each  bouquet  of  a 
given type (main effects, 2-factor interactions, etc.) on the same
figure,  connecting  the  points  for  contrasts  in  each  bouquet  by  a 
broken  line.  This  allows  both  for  internal  comparison  of  the  sizes  of 
contrast  within  a  bouquet  and  comparison  with  the  sizes  of  contrasts 
of  the  other  bouquets  of  that  type.  By  using  the  same  vertical  scale 
for  all  of  the  plots,  we  can  also  compare  the  sizes  of  contrasts  across 
the  various  types.  The  latter  comparison  allows  the  assessment  of  the 
relative  importance  of  a  given  contrast  in  the  experiment  as  a  whole 
and  also,  by  comparing  the  general  level  of  one  bouquet  with  that  of 
the  others,  indicates  if  the  set  of  defining  contrasts  for  a  given  factor 
might  be  replaced  by  another  set  of  defining  contrasts  to  produce  a 
simpler  account  of  the  data.  Such  a  possibility  exists  if  the  general 
level  of  a  particular  bouquet,  particularly  a  main  effect  bouquet,  is 
above  the  levels  of  the  other  bouquets.  We  will  discuss  the  possible 
causes  of  this  in  Part  D. 


A4.  Display  Ratios  in  the  Example 

We  now  return  to  subject  IB1  and  apply  the  above  procedure, 
using the polynomial contrasts. The result is shown in Figure 2,
where  we  have  grouped  the  bouquets  into  three  sets,  plotting  the 
three  main-effect  bouquets  together  in  the  first  panel,  the  three  two- 
factor-interaction  bouquets  together  in  the  second,  and  the  three- 
factor  interactions  in  the  third  panel.  The  vertical  scale  for  the  three 
plots  is  the  same,  allowing  for  the  comparison  of  magnitude  of  the 
display  ratios  for  all  bouquets. 


Figure 2. Person IB1: D.L. in grams. Polynomial contrasts.

The points for contrasts within each bouquet are connected by a
broken line, and the largest size of contrast within each bouquet is
labeled. Occasionally, other contrasts of large size will be labeled.
The notation for the various bouquets is as in the initial analysis of
variance (Table 2); the order of polynomial contrast is indicated by a
number following the bouquet label. For example, R1 is the linear
contrast for rate, W2 is the quadratic contrast for weight, DW12 is the
linear (in date)-by-quadratic (in weight) two-factor contrast, and
DRW123 is the linear (date)-by-quadratic (rate)-by-cubic (weight)
three-factor contrast. (We shall return to the various numbers
attached to the bouquets and to the high individual points in
section B1.)

The  first  thing  to  note  from  the  plots  is  that  the  bulk  of  the 
contrasts  have  display  ratios  at  a  level  of  about  10  grams.  In  light  of 
this background level, we have to recognize the contrasts with the
largest display ratios, namely R1 (linear in rate), W1 (linear in
weight), DW11 (linear in date by linear in weight), D1 (linear in
date), and DRW111 (linear by linear by linear three-factor
interaction), in decreasing order, as worth careful consideration.
The values of
these  display  ratios,  along  with  those  for  certain  of  the  next  largest 
contrasts  within  each  bouquet,  are  given  in  Table  4. 

TABLE 4. Values of display ratio for the largest contrasts within each
bouquet for the polynomial decomposition of IB1 (D.L.).

Contrast      Display Ratio

R1                 124
W1                  41
DW11                34
D1                  28
DRW111              16
R2                  16
W2                  13
all others        < 11

In view of the proportional relationship between D.L. and rate
suggested by Figure 1, it is not surprising that R1 is the strongest
observed relationship. It is striking that the five largest contrasts in
terms of display ratio are solely composed of linear (straight-line)
contrasts and their products. In fact the linear (or linear by linear,
etc.) comparisons have the largest size of contrast within every
bouquet. We will return to this in the next section.

For  the  three-factor-interaction  contrasts,  the  display  ratios  for  the 
contrasts  of  smallest  size  are  relatively  larger  than  might  be  expected. 
As  mentioned  previously,  this  phenomenon  might  be  indicating 
granularity  due  to  rounding  or  small  arithmetic  mistakes  (one  deviant 
observation  tends  to  contribute  similar  amounts  to  each  single 
degree-of-freedom  contrast). 

Another  phenomenon  to  notice  is  associated  with  the  R,  W  and 
DW  bouquets.  In  each  of  those  bouquets,  the  largest  size  of  contrast 
is  relatively  much  larger  than  the  remaining  sizes  of  contrast  within 
the  bouquet.  Notice,  in  each  of  those  three  bouquets,  that  the  next- 
to-largest  size  of  contrast  is  also  somewhat  high  in  terms  of  display 
ratio,  the  point  for  the  contrast  appearing  above  the  display  ratios  of 
all  the  remaining  contrasts  of  the  bouquet.  We  call  the  underlying 
phenomenon  "dragging  upward"  and  will  discuss  it  in  section  A8. 


A5.  The  Largest  of  the  Three-Factor  Contrasts 

The  right-most  point  for  DRW  requires  careful  discussion.  It 
seems  to  continue,  and  enhance,  a  general  upward  trend  for  the  ratios 
plotted  for  the  contrasts  in  this  bouquet.  However,  it  does  not  rise 
far  above  the  others.  Had  this,  for  instance,  been  a  linear-by- 
quadratic-by-quartic  three-factor  interaction,  or  some  other 
nondescript contrast among the 1 x 3 x 6 = 18 in this bouquet, we
would  not  have  been  likely  to  attend  to  it.  It  is,  however,  the  linear- 
by-linear-by-linear  contrast,  a  priori  the  most  distinctive  and  most 
likely  (however  unlikely)  to  contain  something  meaningful.  To  have 
it  come  out  as  the  highest  absolute  value  of  all  3-factor  contrasts  is 
thus, by itself, significant at 1/18 = 5.56%, so that it needs at most a
little  extra  push  to  be  worth  our  honest  attention. 

Granting,  then,  that  it  may  include  a  real  effect,  how  should  we 
interpret  it?  It  is,  after  all,  a  three-factor  contrast  in  a  situation  where 
the  constituent  single-factor  contrasts  are  all  large.  For  the  present, 
then,  we  may  not  be  too  wrong  to  think  of  it  as  "probably  real,  but 
likely  to  be  a  spill-over  from  the  large  main  effects  because  of 
something  resembling  not-quite  satisfactory  expression  of  the 
response."


A6. Pretrimmed Bouquets; Nomination

Having found linear, linear-by-linear and linear-by-linear-by-linear
contrasts outstanding in our example, we must ask ourselves:
"In such a situation, where a few contrasts are distinguished above all
others in their respective bouquets, why did we not plan to treat
them separately in the beginning, not only in this example but in
general?" No good answer is available. So let us trim our bouquets,
and even pretend that we pretrimmed them in this example,
moving the linear contrast out of each single-factor bouquet, the
linear-by-linear ("linear-to-the-2") contrast out of each two-factor
bouquet, and the linear-by-linear-by-linear ("linear-to-the-3") contrast
out of the three-factor bouquet.

Each  original  bouquet,  corresponding  to  a  line  in  the  analysis  of 
variance  table  and  consisting  of  d  contrasts,  has  now  been  partitioned 
into  two  bouquets: 

•  a nominated contrast consisting of the single linear-to-the-j
contrast, nominated a priori as likely to be interesting

•  a trimmed bouquet consisting of the d-1 remaining contrasts,
which collectively are telling us about the contribution of the
corresponding line of the analysis of variance table after
eliminating variation describable by the linear-to-the-j contrast.

In  our  example  this  creates  13  bouquets  from  the  original  7  (since  D 
is  already  linear-to-the-1,  there  is  no  trimmed  bouquet  for  D).  (For  a 
related  use  of  the  word  "nominated"  see  S.  C.  Pearce,  1953  or  1976.) 

By nominating D1, R1, W1, DR11, DW11, RW11, and DRW111 as
a priori interesting contrasts we have agreed to treat each of these
contrasts not as one of the d members of the original effect bouquet,
but rather as a separate thing unto itself. As such, we display them
using the working value c(1:1) = .674 to compute the display ratios
for each of the nominated contrasts.

By pretrimming our bouquets, removing the nominated contrast
from the initial bouquet with d members, and producing a trimmed
bouquet with d-1 members, we have agreed to treat the contrasts in
the trimmed bouquet as collectively separate in impact from the
nominated contrast. As such, we assess the magnitudes of the sizes of
contrast for the d-1 members of the trimmed bouquet in terms of
what would be expected from a sample of size d-1 from a
half-Gaussian distribution and so use the working values
c(1:d-1), ..., c(d-1:d-1) to compute the display ratios.
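
The bookkeeping of pretrimming can be sketched as follows. This is only an illustration under our own assumptions: the function names are ours, and the three rate contrast sizes are rounded values we recomputed from Table 1 with unit-length polynomial contrasts, so they should be read as illustrative rather than as figures quoted from the text.

```python
from statistics import NormalDist


def working_value(i, d):
    """Half-Gaussian working value c(i:d) = Phi^{-1}((3i + 3d)/(6d + 2))."""
    return NormalDist().inv_cdf((3 * i + 3 * d) / (6 * d + 2))


def display_ratios(contrasts):
    """Map {label: contrast} to {label: display ratio} for one bouquet."""
    d = len(contrasts)
    ordered = sorted(contrasts.items(), key=lambda item: abs(item[1]))
    return {label: abs(value) / working_value(i, d)
            for i, (label, value) in enumerate(ordered, start=1)}


def pretrim(bouquet, nominated):
    """Split a bouquet into a one-contrast bouquet and a trimmed bouquet.

    The nominated contrast is judged against c(1:1); the d-1 remaining
    contrasts are judged against c(1:d-1), ..., c(d-1:d-1).
    """
    nominee = {nominated: bouquet[nominated]}
    trimmed = {k: v for k, v in bouquet.items() if k != nominated}
    return display_ratios(nominee), display_ratios(trimmed)


# Illustrative sizes of the rate contrasts (recomputed, rounded; see above).
RATE_BOUQUET = {"R1": 159.5, "R2": 10.7, "R3": 1.0}

nominated_ratio, trimmed_ratios = pretrim(RATE_BOUQUET, "R1")
```

A bouquet with a single member, such as D, leaves an empty trimmed bouquet, for which `display_ratios` simply returns an empty mapping.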

Results in the Example; Nominated Contrasts

Proceeding in this manner with our example data, we produce
Figure 3. In order to better show the detail for the trimmed
bouquets, we have truncated the vertical axis of the plots at 50 and
thus do not show directly the display ratios for the three largest
nominated contrasts: R1, W1 and DW11, each of which should be
plotted at working value c(1:1) = .674.

On examining the plots, we see first of all that the display ratios
of all the nominated contrasts, with the exception of DR11, are
notably larger than the display ratios of any of the contrasts
remaining in the trimmed bouquets (with the exception of the
display ratio of the smallest size of contrast in the trimmed
three-factor bouquet, corresponding to the (small) contrast DRW133;
possible reasons for this large display ratio have been previously
mentioned).

The  display  ratios  for  the  7  nominated  contrasts  are  shown  in 
Table  5,  both  for  the  nominated  contrasts  plotted  in  Figure  3  and,  for 
comparison,  as  parts  of  the  original  effect  bouquets  plotted  in 
Figure  2. 

We  can  see  from  the  table  (and  from  the  plots)  how  much  the 
display  ratios  for  6  of  the  nominated  contrasts  have  each  increased 
when  treated  as  single-contrast  bouquets  over  the  display  ratios  for 
the  same  contrasts  when  treated  as  the  largest  member  of  one  of  the 
original  bouquets.  (The  median  display  ratio,  a  natural  background 
level,  has  fallen  from  10  to  9.)  The  ratio  of  nominated  display  ratio  to 
original  display  ratio  appears  as  the  last  column  of  Table  5.  We  also 
see  that,  in  terms  of  size  of  display  ratio,  the  ordering  of  the  display 
ratios  for  the  nominated  contrasts  is  essentially  the  same  as  before, 
with the exception of D1, which has moved down from the 4th
largest  to  the  6th  largest  (after  nomination). 

Figure 3. Person IB1: D.L. in grams. Polynomial contrasts, pretrimmed
bouquets.

TABLE 5. Display ratios and reference working values for the nominated
contrasts when trimmed out and when left in the original effect
bouquets.

              Nominated Contrasts             (In Figure 2)
             ---------------------   -------------------------------   Ratio of
             Display     Working     Bouquet   Display     Working      Display
Contrast      Ratio    Value c(1:1)   Size d    Ratio    Value c(d:d)   Ratios

R1             237       (.674)          3       124       (1.282)       1.9
W1              98       (.674)          6        41       (1.620)       2.4
DW11            82       (.674)          6        34       (1.620)       2.4
DRW111          49       (.674)         18        16       (2.093)       3.1
RW11            33       (.674)         18        11       (2.093)       3.1
D1              28       (.674)          1        28       (.674)        1.0
DR11             9       (.674)          3         5       (1.282)       1.9

Median Display
Ratio Over
All 55 Contrasts 9                                10                      .9

These increases in the values of the display ratios reflect the sizes
of the working value used to compute the display ratios, which have
decreased from c(d:d) to c(1:1). The ratio of c(d:d) to c(1:1) is exactly
equal to the proportional increase in display ratio due to
pretrimming. We can see from the table that the ratio of increase
grows somewhat as the size d of the original bouquet grows. This
growth explains the increase in relative importance of DRW111 and
RW11, both of which belonged to bouquets with 18 members. This
also helps to explain why DR11, whose display ratio is inflated by a
factor of only 1.9, remains at the level of background variability. The
plot of display ratio vs. working value for the full DR bouquet in
Figure 2 shows DR11 at the end of a general decline; DR11 is
relatively smaller than might be expected for the largest order
statistic of a sample of size 3.

Before  examining  the  trimmed  bouquets  in  Figure  3,  we  should 
reiterate  that  we  are,  in  this  section,  discussing  only  pretrimming.  By 
coincidence  (perhaps),  the  nominated  contrasts  in  our  example  all 
were  the  largest  representatives  of  their  respective  bouquets.  Since 
the linear-to-the-j contrasts are the most easily interpretable (and in
general  the  strongest  in  many  experiments),  we  should  have 
nominated  them  a  priori,  in  any  event.  The  increase  of  display  ratio 
on  nomination  will,  of  course,  be  less  when  the  nominated  contrast  is 
not  the  largest  in  its  bouquet. 
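
The size of the increase on nomination can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that the working value c(i:d) is the half-Gaussian quantile at plotting position (i - 1/3)/(d + 1/3); that choice reproduces the working values printed in Table 5 to about three decimals, but it is not stated in this section, and the function names are hypothetical.

```python
from statistics import NormalDist


def working_value(i: int, d: int) -> float:
    """Assumed typical value of the i-th smallest of d half-Gaussian sizes."""
    p = (i - 1 / 3) / (d + 1 / 3)             # assumed plotting position
    return NormalDist().inv_cdf(0.5 + p / 2)  # half-Gaussian quantile


def nomination_gain(d: int) -> float:
    """c(d:d) / c(1:1): growth in the display ratio of the largest contrast
    when it leaves a bouquet of d for a one-contrast bouquet."""
    return working_value(d, d) / working_value(1, 1)


# Bouquet sizes of the example: 1 (D), 3 (R, DR), 6 (W, DW), 18 (RW, DRW)
for d in (1, 3, 6, 18):
    print(d, round(working_value(d, d), 3), round(nomination_gain(d), 1))
```

Under this assumption the gains agree with the last column of Table 5, and they grow with the bouquet size d, as noted above.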


Other  Alternatives 

If we had used a set of orthogonal contrasts other than
the orthogonal polynomials to define a bouquet, we could, and
probably should, still pretrim whenever one of the contrasts is
naturally a priori distinguished above all others. It is, of course, also
possible to post-trim a bouquet, electing for removal the largest (or
largest few) contrasts which attract our attention because of their
relatively large display ratios. We discuss this possibility in
section B3.


A Nominated Bouquet?

We  have  nominated  7  contrasts,  one  from  each  line  of  the  basic 
analysis.  So  far  we  have  treated  them  as  7  one-contrast  bouquets. 
But  why  should  we  not  treat  them  as  1  seven-contrast  bouquet,  the 
nominated  bouquet?  Table  6  shows  the  display  ratios  for  the 
nominated  bouquet;  the  first  panel  of  Figure  4  is  the  corresponding 
horizontalized  plot. 

Both Table 6 and Figure 4 show R 1 as standing out. It might be
reasonable  to  super-elect  (see  section  B3)  R 1  and  post-trim  the 
nominated  bouquet,  separating  R 1  into  its  own  1-contrast  bouquet 
and  leaving  the  remaining  contrasts  in  a  6-contrast  bouquet.  The 
result  of  this  is  shown  in  the  last  two  columns  of  Table  6  and  the 
second  panel  of  Figure  4.  We  observe  that,  whether  we  post-trim  or 
not,  the  display  ratios  of  the  other  6  nominated  contrasts  are 
surprisingly  similar.  We  will  return  to  this  point  below. 
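
To make the construction of Table 6 concrete, the following sketch recomputes display ratios for the seven nominated contrasts, first as one bouquet of 7 and then after R 1 is set apart. It rests on two assumptions that are ours, not the text's: the sizes of contrast are back-computed from the rounded Table 5 entries (display ratio times .674), and the working values use the plotting position (i - 1/3)/(d + 1/3), which matches the printed working values to about three decimals. Results therefore agree with Table 6 only up to rounding.

```python
from statistics import NormalDist


def working_value(i: int, d: int) -> float:
    """Assumed typical value of the i-th smallest of d half-Gaussian sizes."""
    p = (i - 1 / 3) / (d + 1 / 3)
    return NormalDist().inv_cdf(0.5 + p / 2)


def display_ratios(sizes: dict) -> dict:
    """Size of contrast divided by c(rank:d), with rank 1 the smallest size."""
    d = len(sizes)
    ranked = sorted(sizes, key=sizes.get)
    return {name: sizes[name] / working_value(rank, d)
            for rank, name in enumerate(ranked, start=1)}


# Hypothetical sizes: Table 5 nominated display ratios times c(1:1) = .674
NOMINATED = {"R1": 237 * 0.674, "W1": 98 * 0.674, "DW11": 82 * 0.674,
             "DRW111": 49 * 0.674, "RW11": 33 * 0.674, "D1": 28 * 0.674,
             "DR11": 9 * 0.674}

seven = display_ratios(NOMINATED)
six = display_ratios({k: v for k, v in NOMINATED.items() if k != "R1"})
for name in NOMINATED:
    print(name, round(seven[name]), round(six[name]) if name in six else "-")
```

Because every contrast other than R 1 moves up one rank and loses one bouquet member at the same time, its divisor changes only moderately, which is one way to see why the two sets of display ratios are so similar.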


A7.  Trimmed  Bouquets  in  the  Example 

We  now  return  to  the  trimmed  bouquets  to  consider  the  effect  of 
nomination  and  trimming  on  their  display  ratios.  A  consequence  of 
pretrimming  can  be  seen  by  comparing  the  plots  of  the  display  ratio 
versus  working  value  for  the  trimmed  and  original  bouquets. 
Considering, for example, the three-factor bouquet, we can see from
Figure 2 that the display ratio of the second largest contrast, DRW 123,
appears at essentially the same level as that of the third largest
contrast DRW 112. Turning to Figure 3, we now see that the display
ratio of DRW 123 (now the largest size of contrast) is noticeably lower
(by .7 units) than that of DRW 112 (now the second largest size of
contrast). Looking further, comparing the plots of the trimmed
bouquets with the original bouquets, we can see a general tendency
for the slopes of the lines between adjacent points to become more
negative.


TABLE 6. Display ratios for the nominated contrasts as members of the
seven-contrast nominated bouquet.

                                     After Super-Electing R 1
              7-Contrast Bouquet     and Post-Trimming
              -------------------    ------------------------
              Display   Working      Display   Working
Contrast      Ratio     Value        Ratio     Value

R 1             94      (1.691)       237      (.674)
W 1             55      (1.208)        41      (1.620)
DW 11           61      (.908)         49      (1.119)
DRW 111         49      (.674)         41      (.804)
RW 11           47      (.472)         40      (.555)
D 1             64      (.288)         56      (.336)
DR 11           52      (.114)         45      (.132)

Both  this  and  the  reduction  in  the  general  level  of  the  display 
ratios  for  the  trimmed  bouquets  are  consequences  of  a  reduction  in 
the "dragging upward" phenomenon to be discussed in the next
section. 


The  Changing  Typical  Size  of  Residuals 

Starting to act as if we had nominated all linear-to-the-j contrasts
will change the typical sizes of the display ratios, decreasing such
sizes when the nominated contrasts are large, as in this example, and
fluctuating them irregularly when the nominated contrasts appear
similar in size to the others in their bouquet. The most interesting
summary sizes of the display ratios seem to be:


before nomination:

    median for all 55 initial contrasts: 9.9

    median for 45 initial non-main-effect contrasts: 9.3

after nomination:

    median for all 55 contrasts with nomination: 8.5

    median for all 48 unnominated contrasts: 8.1

    median for all 41 unnominated, non-main-effect contrasts: 6.1


Roughly speaking, nomination reduced the median size of display
ratio by 13%. (The effect would have been larger if we had not used
the median, a highly resistant summary. It is here neither important
nor wholly negligible.)


Figure 4. Person IB1. D.L. in grams.

A8.  Dragging  Upward 

In  our  example  (section  A4)  we  saw  that  retaining  the  largest 
contrast  in  each  bouquet  tended  to  make  the  second-largest  contrasts 
look  more  distinctive  than  need  be.  Do  we  expect  this  in  general? 

We need only look at the divisors, at the order-statistic typical
values, to see how this occurs. For d = 7 (say before setting the
largest apart) and d = 6 (after) we have the values in Table 7. On a
relative basis, the divisor for what was initially the second largest size
of contrast was only about three-quarters as large before setting aside
the largest one as it was after this (ratio .745 in Table 7). It is, in
fact, always the case that the typical value of the ith order statistic
of a sample of size d, c(i:d), is smaller than c(i:d-1), the typical
value of the ith order statistic of a sample of one less. The relative
difference is most pronounced for smaller bouquet sizes and is largest
for the next-to-largest contrast of a bouquet.
Table 8 shows the reference values for the next-to-largest contrast
within an original bouquet both before and after trimming out the
largest contrast, for the bouquet sizes of our example. The table also
includes their ratios c(i:d) / c(i:d-1), and the median value of these
ratios.

If we have pretrimmed our bouquets, then the various display
ratios  for  the  trimmed  bouquets  will  be  reduced  from  what  they 
would  otherwise  have  been  in  the  original  bouquets  by  fractions 
indicated  by  Tables  7  and  8.  If  the  nomination  was  done  in  advance 
of  seeing  the  data  —  and,  also,  to  a  practical  approximation,  if  it  was 
done  before  any  detailed  analysis  of  the  data  was  made  —  then  the 
after-trimming  display  ratios  will  almost  surely  be  more  appropriate. 
(True  post-trimming  requires  somewhat  more  careful  thought.)  The 
larger  display  ratios  before  trimming  were  "dragged  upward"  by 
being  taken  as  less  exalted  order  statistics  than  they  deserved  to  be  — 
because  the  even  higher  contrast,  deserving  of  nomination,  unfairly 
seized  the  highest  position.  By  trimming,  we  have  prevented  this 
dragging  upward,  restoring  the  display  ratios  to  what  they  ought  to 
be. 
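
The divisors behind dragging upward are easy to tabulate. The sketch below is a rough check rather than the authors' computation: it assumes the working value c(i:d) is the half-Gaussian quantile at plotting position (i - 1/3)/(d + 1/3), which matches the printed working values to about three decimals, and it prints c(i:d) / c(i:d-1) for the next-to-largest contrast together with the median of these ratios over i.

```python
from statistics import NormalDist, median


def working_value(i: int, d: int) -> float:
    """Assumed typical value of the i-th smallest of d half-Gaussian sizes."""
    p = (i - 1 / 3) / (d + 1 / 3)
    return NormalDist().inv_cdf(0.5 + p / 2)


def dragging_ratios(d: int) -> list:
    """c(i:d) / c(i:d-1) for i = 1, ..., d-1 (smallest size first)."""
    return [working_value(i, d) / working_value(i, d - 1)
            for i in range(1, d)]


for d in (3, 6, 7, 18):
    ratios = dragging_ratios(d)
    # ratios[-1] belongs to the next-to-largest contrast of the bouquet
    print(d, round(ratios[-1], 3), round(median(ratios), 3))
```

Every ratio is below 1, and the ratio for the next-to-largest contrast is the smallest in each bouquet, so that contrast is the one most inflated when the largest is left in.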


TABLE 7. Illustration, for d = 7, of dragging upward via denominators.

          d = 7      d = 6                 Ratio to
   i      c(i:7)     c(i:6)     Ratio      Median

   7      1.691        -          -           -
   6      1.208      1.620      .745        .882
   5       .908      1.119      .812        .961
   4       .674       .804      .838        .992
   3       .472       .555      .852       1.008
   2       .288       .336      .859       1.017
   1       .114       .132      .863       1.021

                     median =   .845

TABLE 8. Dragging upward via denominators of the next-to-largest
contrast.

   d      c(d-1:d)    c(d-1:d-1)    Ratio    Median Ratio*

   3        .674        1.068       .631        .661
   6       1.119        1.534       .729        .823
  18       1.691        2.070       .817        .937

* Median is of c(i:d) / c(i:d-1) for i = 1, ..., d-1.


A Possible Initial Plot

We could try to reflect this dragging upward effect in an initial
plot, at the cost of making things rather "busy" looking. Figure 5
shows the three-factor sizes of contrast with

    size / c(i:d)      plotted as 0

    size / c(i:d-1)    plotted as 1

and

    size / c(i:d-2)    plotted as 2

with the three for each i connected by broken lines. The vertical axis
has been truncated at 20 to show the main detail. We see a consistent
decrease in the value of the display ratio as we move from a bouquet
of all 18 to a bouquet of the smallest 17 to a bouquet of the smallest
16 sizes of contrast. The horizontal spread in the plot of each size of
contrast, indicated by the pair of lines connecting a 0, 1 and 2, shows
the relative change in value of the reference value as the bouquet size
is reduced. The change in working value versus bouquet size is the
most pronounced for the largest size of contrast, as is the decline in
the value of the display ratio.

Figure 5. Effect of undragging for the three-factor contrasts. IB1, D.L. in
grams, polynomial contrasts. (Horizontal axis: working value. The ith
sizes of contrast within bouquets of 18, 17 and 16 are connected; the
plot symbol is the number of largest contrasts set aside.)

Since  we  must  expect  some  such  effect,  even  if  the  largest  contrast 
did  not  deserve  to  be  nominated,  it  is  not  easy  to  argue  cogently  from 
such  a  plot.  We  shall  not  pursue  its  possible  use  further. 


A9.  The  Effect  of  Aggregation 

Sections  A2  to  A8  have  been  concerned  with  either  (a)  choosing  a 
general  approach  or  (b)  illustrating  the  consequence  of  using  the 
approach  starting  from  Table  2  or  an  analog.  It  is  time  to  ask  what 
changes  we  expect  when  we  start  from  an  analog  of  Table  3,  if  we 
aggregate  before  going  to  the  individual  contrasts. 


Opening  Analysis 

If  we  are  to  pretrim,  we  should  choose  the  nominees  before  any 
detailed  analysis  of  the  data.  So  it  is  natural  for  us  to  begin  with  a 
conventional  post-nomination  analysis-of-variance  table,  involving  13 
lines.  Table  9  sets  out  the  numbers.  We  will  use  the  notation  "X  (n)" 
for  the  nominated  portion  of  the  bouquet  labeled  by  "X". 

A  striking  thing  to  note  from  the  post-nomination  analysis-of- 
variance  table  is  that  the  value  of  the  mean  square  for  DRWtrim,  the 
three-factor interaction after removing the linear-to-the-j component,
is  larger  than  the  values  of  the  mean  squares  for  any  of  the  other 
"trimmed"  lines  in  the  table. 


The  Notion  of  "Above" 

In  a  general  context,  one  line  of  an  analysis-of-variance  table  is 
"above"  another  line  if  variability  in  the  "lower"  line  inevitably 
penetrates  into  the  "upper"  line,  a  situation  often  formalized  (usually 
satisfactorily)  as:  "the  expected  mean  square  of  the  former  ("upper") 
line contains all the terms in the expected mean square of the other
("lower") line."

In doing aggregation after nomination, we need to be somewhat
careful in defining what is above what. We will take the view here
that "X(n)" is above "Xtrim" and also above anything that "Xtrim" is
above (using the conventional definition to decide the partial
orderings for the "trim" lines). Whether any nominated contrast can
be above any other nominated contrast is a consequence of the
particular sets of orthogonal contrasts used for the various bouquets.
In our particular situation, where we are using sets of orthogonal
polynomials (and their outer products) to define the various bouquets,
Interpretation One holds that no nominated contrast can be above
any other nominated contrast. Thus, for example, R1 is not above
RW11 because R1 contains the mean of the W effect while RW11
contains the slope (but not the mean).

The situation is perhaps easier to understand if we use a less
familiar notation. Write x0 if we have taken a mean over x, and x1 if
we have taken a slope over x, and X or Xt for having done neither,
without or with trimming. Then

    R1    →  d0 r1 w0

    RW11  →  d0 r1 w1

while

    R     →  d0 R w0

    RW    →  d0 R W

or

    Rt    →  d0 Rt w0

    RtWt  →  d0 Rt Wt

so that the 3 basic questions are: Is w0 above w1? (No, but!) Is w0
above W? (Yes!) Is w0 above Wt? (Yes!) Here the parenthesized
answers are for Interpretation One.


TABLE 9. Analysis-of-variance table after nomination and before
aggregation. (Labels for all nominees (i.e., linear, linear-by-
linear, etc.) are marked (n).)

Label        df       MS      DEN       F       Sig

R(n)          1    25426       94     270.5    0.01%
D(n)          1      348       94      3.70    10%
W(n)          1     4338       94     46.1     0.01%
DR(n)         1       36       94       .38    not
RW(n)         1      492       94      5.23    5%
DW(n)         1     3041       94     32.4     0.01%
DRW(n)        1     1089       94     11.6     0.5%
Rtrim         2       58       94       .62    not
Wtrim         5       59       94       .63    not
DRtrim        2       14       94       .15    not
RWtrim       17       49       94       .52    not
DWtrim        5       46       94       .49    not
DRWtrim      17       94        -
(Total)     (55)     689

The  issue  is  that  a  real  (non-zero)  slope  need  not  —  but  is  very 
likely  to  —  imply  a  real  (non-zero)  mean.  If  the  exact  location  of  the 
mean  for  a  factor  is  not  any  meaningful  value,  then  the  likelihood  of 
a  mean-free  slope  is  small,  so  we  may  want  to  move  away  from 
Interpretation  One. 

It  is  far  from  clear  when  we  ought  to  move  all  the  way  from 
Interpretation  One  to  Interpretation  Two  and  say  "x0  is  over  x1".  For 
the  present,  we  recommend,  in  such  circumstances,  accepting 
Interpretation  Two  as  a  possible  alternative,  not  an  exclusive  choice. 
One reason for this caution is the absence of any standardized way
for x1 to contribute to x0 that is at all analogous to the standard
contribution

    σ² / (number of terms)

of X to x0.


Rules  for  Aggregation 

Aggregation  (as  detailed  by  Green  and  Tukey)  is  the  combination 
of  lines  in  the  analysis-of-variance  table  according  to  the  values  of 
their  mean  squares,  using  a  rule  of  thumb  (in  philosophical  contrast 
with  significance  testing,  where  non-significance  is  followed  by 
pooling). The procedure is to start with the lowest remaining line (in
terms of "above" with ties broken by value of mean square) and
aggregate any other line with it which

a)  is  "above"  the  basic  line  and  has  mean  square  less  than 
twice  that  of  the  basic  line; 

b)  is  NOT  "above"  any  other  line  which  does  not  satisfy  (a). 


A10.  Aggregation  in  the  Example 

We  start  the  aggregation  with  DRWtrim,  the  lowest  of  the  low,  as 
we  should.  Since  the  line  for  each  of  the  trimmed  bouquets  is  above 
DRWtrim  and  since  each  of  the  mean  squares  is  less  than  (twice)  that 
of  DRWtrim,  the  entire  set  of  trimmed  bouquets  will  be  aggregated 
together.  Additionally,  since  the  nominated  contrast  DR  11  is  above 
DRtrim,  it  is  also  above  DRWtrim;  and  since  the  mean  square  of  DR  11 
is  also  less  than  (twice)  that  of  DRWtrim,  DR  11  is  also  aggregated  in 
with  the  trimmed  bouquets.  No  other  nominated  contrasts  have 
mean  squares  less  than  twice  that  of  DRWtrim,  and  so  this  step  of 
aggregation  ceases.  We  will  identify  the  aggregated  collection  of  all 
trimmed  bouquets  and  DR  11  as  "residual". 

No  other  aggregations  are  possible,  and  so  we  are  led  to  an 
aggregated  analysis-of-variance  table  with  7  lines:  6  lines 
corresponding  to  the  6  largest  nominated  contrasts  and  a  seventh  line 
called  residual  with  49  degrees-of-freedom. 
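
The first aggregation step can be traced mechanically from Table 9. The sketch below is a simplified illustration under Interpretation One, not a general implementation of the Green and Tukey procedure. The `Line` record, the helper names, and the encoding of "above" by sets of factor letters are our own devices; in particular we assume that no line is above a nominated line, which is consistent with the result reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    label: str
    factors: frozenset   # factor letters involved, e.g. {"D", "R"}
    nominated: bool      # True for an "X(n)" line, False for "Xtrim"
    df: int
    ms: float


def make(label, factors, nominated, df, ms):
    return Line(label, frozenset(factors), nominated, df, ms)


# (label, factors, nominated, df, MS) as read from Table 9
TABLE_9 = [make(*row) for row in [
    ("R(n)", "R", True, 1, 25426), ("D(n)", "D", True, 1, 348),
    ("W(n)", "W", True, 1, 4338), ("DR(n)", "DR", True, 1, 36),
    ("RW(n)", "RW", True, 1, 492), ("DW(n)", "DW", True, 1, 3041),
    ("DRW(n)", "DRW", True, 1, 1089), ("Rtrim", "R", False, 2, 58),
    ("Wtrim", "W", False, 5, 59), ("DRtrim", "DR", False, 2, 14),
    ("RWtrim", "RW", False, 17, 49), ("DWtrim", "DW", False, 5, 46),
    ("DRWtrim", "DRW", False, 17, 94),
]]


def above(a: Line, b: Line) -> bool:
    """Interpretation One, as encoded here (an assumption of this sketch)."""
    if a == b or b.nominated:
        return False                      # nothing is above a nominated line
    if a.nominated:
        return a.factors <= b.factors     # X(n) is above Xtrim and below it
    return a.factors < b.factors          # conventional ordering of trim lines


def aggregate_step(lines, base):
    """Pool with `base` every line meeting rules (a) and (b)."""
    ok = [x for x in lines if above(x, base) and x.ms < 2 * base.ms]  # rule (a)
    failing = [x for x in lines if x != base and x not in ok]
    chosen = [x for x in ok if not any(above(x, y) for y in failing)]  # rule (b)
    pool = [base] + chosen
    df = sum(x.df for x in pool)
    ms = sum(x.df * x.ms for x in pool) / df   # pooled mean square
    return pool, df, ms


base = next(x for x in TABLE_9 if x.label == "DRWtrim")
pool, df, ms = aggregate_step(TABLE_9, base)
print(sorted(x.label for x in pool), df, round(ms))
```

With the Table 9 entries the pooled line has 49 degrees of freedom, and its pooled mean square serves as the denominator for the lines that remain.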


TABLE 10. Display ratios for the six largest nominated contrasts as
members of 6-contrast and 5-contrast bouquets.

                                     After Super-Electing R 1
              6-Contrast Bouquet     and Post-Trimming
              -------------------    ------------------------
              Display   Working      Display   Working
Contrast      Ratio     Value        Ratio     Value

R 1             98      (1.620)       237      (.674)
W 1             59      (1.119)        43      (1.534)
DW 11           68      (.804)         55      (1.009)
DRW 111         59      (.555)         49      (.674)
RW 11           65      (.336)         55      (.402)
D 1            141      (.132)        119      (.157)

We can choose to treat the 6 remaining nominated contrasts as a
single  bouquet  of  6.  In  this  case,  we  form  the  display  ratios  shown  in 
the  first  two  columns  of  Table  10.  If  we  were  "splitters,"  we  might 
make  a  separate  bouquet  of  R  1,  leaving  the  other  5  nominees  in  a 
single  bouquet,  which  we  will  call  "middle  5"  (since  the  largest  and 
smallest  of  the  original  7  have  been  removed).  The  last  two  columns 
of  Table  10  show  the  resulting  display  ratios. 

The  striking  thing  in  Table  10,  when  compared  with  Table  6,  is 
the proportionally large increase in the display ratio for D 1. The
suggestion,  in  view  of  the  relative  constancy  of  the  display  ratio  of 
the  next  4  contrasts,  is  that  the  size  of  the  date  main  effect  is  larger 
than  might  be  expected  for  the  smallest  of  the  unaggregated 
contrasts,  possibly  because  part  of  one  of  the  dates  was  "different". 

The  analysis-of-variance  table  after  aggregation,  collecting  the  6 
largest  nominated  contrasts  into  a  bouquet  and  then  electing 
(separating) out R 1, has three lines, given in Table 11. The
horizontalized  plots  for  the  three  bouquets  corresponding  to  Table  11 
are  shown  in  Figure  6.  The  essential  difference  between  the  first 
panel  of  Figure  6  and  the  second  panel  of  Figure  4  is  that  we  have 
now  put  the  smallest  nominated  contrast,  DR  11,  into  the  residual 
before  constructing  the  nominated  bouquet.  (The  size  of  contrast  for 
DR 11 is, in fact, exactly the median size of contrast of the 49
members  of  the  residual  poly-bouquet.)  The  main  result  of  this 
exclusion  of  DR  11  from  the  nominated  bouquet  is  to  inflate  the  size 
of  the  display  ratio  of  the  now-smallest  member,  D 1. 


TABLE 11. Analysis-of-variance table after nominating, aggregating,
collecting all remaining nominated contrasts into a bouquet
and electing out the largest.

Label        df       MS      DEN       F      Sig*

R(n)          1    25426       64      397     0.01%
middle 5      5     1862       64       29     0.01%
Residual     49       64
Total       (55)     689

* Notice that (a) significance level of F-values requires a large, rather
unspecified multiplier for multiplicity and (b) 0.01% is the most extreme level
considered.

Turning to the second panel of Figure 6, the horizontalized plot of
the  residual  bouquet  of  49,  we  see  that  our  aggregation  has 
eliminated  the  high  values  of  display  ratios  for  the  smallest  of  the 
three-factor  contrasts  that  we  have  become  used  to  seeing.  Instead, 
the  smallest  four  contrasts  within  the  residual  bouquet  (in  order: 
DW13,  RW33,  RW  24  and  W6)  have  smaller  display  ratios  than  might 
be  expected.  This  seems  unlikely  to  mean  anything.  (The  smallest 
three-factor  contrasts,  DRW  133  and  DRW116,  are  now  the  8th  and 
11th  smallest  contrasts  of  the  49  and  are  in  the  bump  seen  on  the 
left-hand  side  of  the  plot).  The  largest  three  contrasts  are  the  three- 
factor  contrasts  DRW123,  DRW  112,  DRW122,  in  decreasing  order, 
each  of  which  has  a  display  ratio  very  slightly  smaller  than  might  be 
expected.  In  general,  the  display  ratios  of  the  contrasts  in  the 
residual  bouquet  are  almost  precisely  what  we  would  expect  for 
random  Gaussian  noise  with  homogeneous  variability. 


Results  after  Aggregation 

Figure 6. Person IB1, D.L. (grams). Result of aggregation and election.
(First panel: nominated contrasts as a bouquet, R 1 super-elected,
DR 11 excluded. Second panel: aggregated contrasts as a bouquet,
including all trimmed bouquets and DR 11. Horizontal axes: working
value.)

At the end of our aggregated analysis we have come to the
following overall phenomenological picture:

• large linear rate effect

• moderate linear-to-the-j contrasts for each of 5 other lines

• perhaps a few small exotics, calculation errors, etc. that remain
unflagged

• a featureless "error body"

This  is,  of  course,  except  for  the  small  exotics,  just  what  we  saw  when 
we  didn't  begin  by  aggregating,  an  encouraging  agreement.  All  these 
choices  of  analysis  can,  and  often  should,  lead  to  the  same  report  (see 
section  B5). 


And  under  Interpretation  Two? 

Almost  as  an  aside,  we  note  that  if  we  had  considered  the  various 
nominated  contrasts  to  be  above  each  other,  as  might  seem  reasonable 
after  aggregation  (i.e.,  R1  is  above  RW11,  DR  11  and  DRW  111  but  not 
above  DW11),  then  the  resulting  aggregation  would  produce  4  lines 
rather  than  3,  namely: 

• R 1

• D 1, DW 11 and W 1 (linear date and weight involving a mean on R)

• DR 11, RW 11 and DRW 111 (linear date and weight involving a
slope on R)

• all trimmed bouquets

This,  too,  gives  an  analysis  worth  looking  at  and  thinking  about. 


PART  B:  DATA-GUIDED  TRIMMING  OF  BOUQUETS 

B1. Scales and Ratios-to-Scale

We are now used to plotting display ratios of the form

    size of contrast / typical order statistic

in various ways. The general picture is that most ratios are
estimating a residual variability (possibly differing from one bouquet
to another), but that some, hopefully only a few, are trying to reveal
consistent contributions of some sort. To assess the size of the
residual variability, it is thus natural to turn to the median (or
conceivably the midmean) of these display ratios. It is this median
that is shown as "scale" next to the bouquet labels in the
horizontalized plots (Figures 2, 3, 4, 6). Because of the "dragging
upward" phenomenon we are not surprised to see this scale decrease
somewhat as we trim the bouquets, removing opportunities for
dragging upward.


Ratio-to-Scale 

We have been judging the sizes of the display ratios both
externally (looking across bouquets) and internally (within
either the entire or a trimmed bouquet). For internal comparison, it is
convenient to have numbers, and the natural number to look at is the

    ratio-to-scale = display ratio / median display ratio


(In  other  contexts,  but  not  here,  we  may  want  to  refer  to  this  as  an 
assassination  or  sterilizability  ratio.)  It  is  these  ratios  that  are  attached 
(in  parentheses)  to  individual  high  points  in  the  horizontalized  plots 
(Figures  2,  3,  4,  6). 
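
As a small worked case, the ratio-to-scale can be computed directly from the 6-contrast display ratios printed in Table 10. The function name below is hypothetical, and the median of an even number of values is taken as the average of the middle two, the convention of Python's `statistics.median`.

```python
from statistics import median


def ratio_to_scale(display_ratios: dict):
    """Return (scale, ratios-to-scale), with scale = median display ratio."""
    scale = median(display_ratios.values())
    return scale, {name: r / scale for name, r in display_ratios.items()}


# Display ratios of the 6-contrast bouquet, first column of Table 10
SIX_CONTRAST = {"R1": 98, "W1": 59, "DW11": 68,
                "DRW111": 59, "RW11": 65, "D1": 141}

scale, rts = ratio_to_scale(SIX_CONTRAST)
print(scale)
for name, value in rts.items():
    print(name, round(value, 2))
```

Here the scale is 66.5, and D 1 is the only member whose ratio-to-scale exceeds 2, in line with the remark that its display ratio is proportionally large in Table 10.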


B2.  Null  Behavior  of  "Ratio-to-Scale" 

It is helpful to know how large a value of ratio-to-scale we are
likely to see in a null situation, particularly for the contrast of
largest size (in the bouquet at hand). The distribution of

    display ratio (for largest-size contrast)
      / median display ratio (for all contrasts in the bouquet)

is easily simulated, starting with a sample of d "sizes" from a half-
Gaussian. The resulting percent points are given in Table 12.


Simulation  Details 

Each row of Table 12, corresponding to a bouquet of size d, was
computed from the empirical distribution of the ratio-to-scale for the
largest order statistic from a sample of size d from a half-Gaussian
distribution. This empirical distribution was based on 2048 replicates.
The half-Gaussian random variates were generated using the random
normal generator of Kinderman and Monahan (1976), that generator
using the McGill universal uniform generator (see Chambers, 1977), a
combination of a 32-bit congruential with an independent 32-bit
shift-register generator.
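
A simulation of this kind can be sketched in a few lines. What follows is not the original program: it uses Python's built-in generator rather than the Kinderman and Monahan and McGill generators, it assumes working values from the plotting position (i - 1/3)/(d + 1/3), and the rule used to read off an upper percent point from the sorted replicates is one simple convention among several. Function names and the seed are arbitrary.

```python
import random
from statistics import NormalDist, median


def working_value(i: int, d: int) -> float:
    """Assumed typical value of the i-th smallest of d half-Gaussian sizes."""
    p = (i - 1 / 3) / (d + 1 / 3)
    return NormalDist().inv_cdf(0.5 + p / 2)


def simulate_largest_ratio_to_scale(d: int, replicates: int = 2048,
                                    seed: int = 0) -> list:
    """Null replicates of the ratio-to-scale of the largest size of contrast."""
    rng = random.Random(seed)
    divisors = [working_value(i, d) for i in range(1, d + 1)]
    out = []
    for _ in range(replicates):
        sizes = sorted(abs(rng.gauss(0.0, 1.0)) for _ in range(d))
        ratios = [s / c for s, c in zip(sizes, divisors)]   # display ratios
        out.append(ratios[-1] / median(ratios))
    return out


def upper_percent_point(values: list, prob: float) -> float:
    """Value exceeded by roughly a fraction `prob` of the replicates."""
    ordered = sorted(values)
    k = round((1 - prob) * len(ordered))
    return ordered[min(max(k, 0), len(ordered) - 1)]


for d in (2, 6, 18):
    sample = simulate_largest_ratio_to_scale(d)
    print(d, [round(upper_percent_point(sample, p), 2)
              for p in (0.20, 0.10, 0.05)])
```

Different generators and seeds will move the estimated percent points by amounts comparable to the standard errors reported with Table 12, so only rough agreement with the table should be expected.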


Simulation  Error 

To obtain an estimate of the variability of these simulated percent
points, a method akin to balanced repeated replications (the
multi-halving jackknife) was used. Based on the parity of the ith digit in
the binary representation of the replication number (1-2048, in the
order of generation), the collection of 2048 values of ratio-to-scale can
be divided into 11 pairs of mutually orthogonal, interpenetrating
half-samples. From each pair of half-samples, two estimates (one for
each half-sample) of the percentage points can be obtained, and an
estimate of the variance of the (full-sample) percent point, worth
perhaps 1 df, comes from one quarter of the squared difference of the
half-sample estimates. The standard errors, shown in parentheses in
Table 12, are the square roots of the average of the 11 separate
variance estimates and each is worth (optimistically) perhaps 11 df.


TABLE 12. Percent points from the distribution of the ratio-to-scale of the
largest size of contrast for a sample of size d from the half-
Gaussian (from simulations of 2048 replicates each; standard
errors in parentheses).

Sample Size                  Probability of Larger Value
     d          20%           10%           5%            1%           0.5%

     2      1.40 (.02)    1.67 (.01)    1.83 (.01)    1.95 (.01)    1.98 (.01)
     3      1.22 (.03)    1.75 (.06)    2.48 (.15)    5.25 (.67)    6.71 (10)
     4      1.30 (.03)    1.65 (.04)    2.12 (.08)    3.86 (.32)    4.80 (.40)
     5      1.30 (.03)    1.70 (.05)    2.14 (.06)    3.86 (.25)    4.61 (.43)
     6      1.30 (.01)    1.60 (.04)    1.95 (.04)    3.22 (.18)    3.90 (.27)
     7      1.28 (.02)    1.56 (.03)    1.91 (.06)    2.98 (.12)    3.70 (.30)
     8      1.28 (.02)    1.53 (.02)    1.80 (.05)    2.64 (.08)    3.19 (.24)
     9      1.24 (.01)    1.53 (.02)    1.80 (.04)    2.58 (.11)    3.01 (.17)
    10      1.27 (.03)    1.51 (.03)    1.78 (.04)    2.61 (.09)    2.88 (.10)
    15      1.24 (.02)    1.46 (.02)    1.65 (.02)    2.27 (.11)    2.54 (.13)
    20      1.22 (.01)    1.39 (.01)    1.55 (.02)    2.01 (.03)    2.17 (.03)
    30      1.18 (.01)    1.30 (.02)    1.44 (.02)    1.74 (.04)    1.88 (.05)

 approx.
(for d > 2) 1.20 + …/d    1.30 + …/d    1.38 + …/d    1.42 + …/d    1.43 + …/d
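
The half-sample calculation can be sketched as follows. This is an illustration under stated assumptions, not the original code: replicates are indexed from 0 so that each of the 11 binary digits splits the 2048 values into two halves of 1024, and each pair contributes one quarter of the squared difference of its two half-sample estimates, the usual choice for complementary half-samples. The function name and the `statistic` argument are hypothetical.

```python
import math
from statistics import mean


def half_sample_se(values: list, statistic, bits: int = 11) -> float:
    """Standard error of `statistic` from `bits` pairs of half-samples.

    values    -- replicates in order of generation, len(values) == 2**bits
    statistic -- function mapping a list of replicates to one number
    """
    if len(values) != 2 ** bits:
        raise ValueError("need exactly 2**bits replicates")
    variances = []
    for b in range(bits):
        # split on the parity of binary digit b of the 0-based index
        ones = [v for k, v in enumerate(values) if (k >> b) & 1]
        zeros = [v for k, v in enumerate(values) if not (k >> b) & 1]
        diff = statistic(ones) - statistic(zeros)
        variances.append(diff * diff / 4)      # worth about 1 df each
    return math.sqrt(sum(variances) / bits)    # average, then square root


# Toy check on a sequence whose half-sample means differ by exactly 2**b
print(half_sample_se(list(range(2048)), mean))
```

For a percent point, `statistic` would be a function returning, say, the 5% upper point of a list of simulated ratios-to-scale.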


d = 2 is Special

The anomalous appearance of the percent points for d = 2, relative
to the general pattern exhibited for the larger bouquet sizes, is due to
the  special  constraint  placed  on  the  maximum  size  of  the  ratio-to- 
scale  for  a  sample  of  size  2.  Since  we  have  defined  the  denominator 
of  the  ratio-to-scale  as  the  median  of  the  two  display  ratios,  the  values 
of  the  ratio-to-scale  for  a  sample  of  size  2  are  bounded  above  by  2. 
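
The bound can be written out in one line. In the sketch below, r_1 and r_2 are our own symbols for the display ratios of the smaller and larger size of contrast in a bouquet of 2, and the median of two values is taken as their average.

```latex
\[
  \text{ratio-to-scale}
  \;=\; \frac{r_2}{\tfrac{1}{2}\,(r_1 + r_2)}
  \;=\; \frac{2\,r_2}{r_1 + r_2}
  \;<\; 2
  \qquad (r_1 > 0,\; r_2 > 0),
\]
```

The value 2 is approached only as r_1 / r_2 tends to 0, which is why the upper percent points for d = 2 crowd together just below 2.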


Some  Approximations 

Approximations for the values of these percent points, valid for
d > 2, are given at the bottom of Table 12. These approximations
were derived by fitting a linear dependence of the percent point on
1/d, a column at a time, by a simple resistant regression. To preserve
monotonicity between columns for large values of d, the last two
approximations were modified by -.02 and +.02, respectively. The
approximations have relative errors of less than 8% throughout the
pertinent (d ≥ 3) entries of the table.

In  terms  of  the  estimated  standard  errors,  30  of  the  55 
approximations  were  within  1  standard  error  of  the  simulated 
percentage  point,  and  44  of  the  55  approximations  were  within  2 
standard errors. (The null comparison calls for 36 and 51,
respectively.) In these terms, the approximations were relatively
better  for  the  columns  for  5%,  1%  and  0.5%  probabilities  of  a  larger 
value  (P),  where  31  of  the  33  approximations  were  within  two 
standard  errors. 

The maximum absolute estimated error for the approximations is,
by column, .10 for P = 20%, .12 for P = 10%, .11 for P = 5%, .30 for
P = 1% and .36 for P = 0.5%.


B3. Post-Trimmed Bouquets; Election

The  value  of  the  ratio-to-scale  of  the  largest  size  of  contrast  within 
a  bouquet  (which  may  be  either  already  pretrimmed  or  an  original 
bouquet)  provides  us  with  a  criterion  for  use  in  assessing  the  amount 
of  attention  that  should  be  paid  to  that  particular  contrast  in  terms  of 
describing  the  data.  The  largest  size  of  contrast  within  the  bouquet 
can  be  considered  as  indicating  an  interesting,  potentially  credible, 
component  of  the  total  information  collectively  imparted  by  the 
bouquet  if  the  value  of  its  ratio-to-scale  is  larger  than  some  selected 
threshold  value,  where  the  threshold  value  can  be  selected  in  view  of 
our  Table  12  and  probably  should  depend  on  the  number  of  contrasts 
in  the  bouquet. 

Having  found  such  a  display  ratio,  we  can  elect  it  for  special 
attention  and  post-trim  the  bouquet,  creating  two  new  bouquets:  one 
consisting  solely  of  the  elected  contrast  —  designated  as  potentially 
interesting  —  and  the  other,  post-trimmed  bouquet  consisting  of  the 
remainder. We then proceed as we did for nomination and
pretrimming, recomputing the display ratios, standing ready to assess the
relative  importance  of  the  elected  contrast  in  the  experiment  as  a 
whole  in  terms  of  its  relative  standing  in  terms  of  display  ratio. 

We  can,  and  may  need  to,  repeat  the  election  process  before  doing 
anything  else,  comparing  the  ratio-to-scale  of  the  (now)  largest  size  of 
contrast  within  the  post-trimmed  bouquet  with  an  appropriate 
threshold  value  and  proceeding  as  above  if  the  threshold  value  is 
exceeded.  (Experience  will  show,  we  believe,  to  what  extent  such 
recursive  calculation  is  wise.)  At  some  stage,  which  might  very  well 
be  the  assessment  of  the  largest  size  of  contrast  within  the  initial  (not 
post-trimmed)  bouquet,  the  ratio-to-scale  of  the  largest  size  of  contrast 
within  the  current  bouquet  will  not  exceed  the  threshold  value.  The 
interpretation  is  that  this  contrast,  and  hence  all  remaining  contrasts 
of  smaller  size,  does  not  appear  individually  to  be  capturing  a 
significant  amount  of  the  information  embodied  in  the  bouquet 
beyond  that  expected  by  the  simple  (null)  half-Gaussian  model.  At 
this  stage,  post-trimming  certainly  ceases. 
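
The election loop just described can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' program: `working_values` uses half-normal quantiles at plotting positions (3i - 1)/(3n + 1), a form inferred from the working values quoted elsewhere in this chapter; the scale is taken to be the median display ratio of the current bouquet; and `threshold` is a hypothetical callable standing in for a lookup in Table 12.

```python
from statistics import NormalDist, median


def working_values(n):
    """Assumed half-normal working values for a bouquet of n, largest first."""
    nd = NormalDist()
    return [nd.inv_cdf(0.5 + 0.5 * (3 * i - 1) / (3 * n + 1))
            for i in range(n, 0, -1)]


def top_ratio_to_scale(sizes):
    """Ratio-to-scale of the largest size of contrast in a bouquet."""
    ordered = sorted(sizes, reverse=True)
    display = [s / w for s, w in zip(ordered, working_values(len(ordered)))]
    scale = median(display)  # assumption: scale = median display ratio
    return display[0] / scale


def elect(bouquet, threshold):
    """Repeatedly elect the largest contrast while it exceeds the threshold.

    bouquet:   dict mapping contrast name -> size of contrast
    threshold: hypothetical callable, bouquet size -> cutoff (e.g. from Table 12)
    """
    remaining = dict(bouquet)
    elected = []
    while len(remaining) > 1:
        top = max(remaining, key=remaining.get)
        if top_ratio_to_scale(list(remaining.values())) <= threshold(len(remaining)):
            break  # post-trimming ceases at this stage
        elected.append(top)
        del remaining[top]
    return elected, remaining


# Made-up sizes, for illustration only.
example = {"A": 5.0, "B": 0.5, "C": 0.4, "D": 0.3, "E": 0.2}
elected, post_trimmed = elect(example, threshold=lambda n: 2.5)
print(elected, sorted(post_trimmed))
```

With these invented sizes only the first contrast is elected; the remaining four form the post-trimmed bouquet.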

We  will  end  up  splitting  the  initial  bouquet  into  two  sets  of 
contrasts:  the  first  set,  which  may  be  empty,  consists  of  the  elected 
contrasts,  the  largest  in  the  initial  bouquet,  each  of  which 
individually  appears  to  account  for  an  important  amount  of  the 
information  embodied  in  the  bouquet.  The  second  set  is  the  post- 
trimmed  bouquet  and  consists  of  the  remaining  (smaller)  contrasts, 
which  are  deemed  not  to  be  separately  providing  significant 
information.  We  return  later  to  the  possibility  of  extracting 
additional  information  from  the  post-trimmed  bouquet. 


Treating  the  Elected  Contrasts 

The  elected  contrasts  can  be  treated  either  individually,  source 
bouquet  by  source  bouquet,  or  as  members  of  a  single  bouquet,  the 
elected  bouquet,  analogously  to  our  treatment  of  the  nominated 
contrasts.  They  could  even  be  combined  together  with  the 
nominated  contrasts  into  a  single  bouquet,  the  nominated-plus- 
elected  bouquet.  In  considering  the  elected  contrasts,  some  attention 
should  be  paid  to  issues  of  multiplicity. 

We  can  apply  the  post-trimming  procedure  to  the  nominated 
bouquet  itself  (as  well  as  to  the  elected  bouquet  and  the  nominated- 
plus-elected  bouquet).  Contrasts  flagged  as  too  large  in  terms  of  their 
ratio-to-scale  within  such  a  bouquet  will  be  referred  to  as  being 
super-elected,  since  they  have  distinguished  themselves  above  the 
other  contrasts,  already  distinguished  by  nomination  or  election. 


B4.  Election  (Post-Trimming)  in  the  Example 

Although we have advocated pretrimming (nominating) the D1,
R1, W1, DR11, DW11, RW11 and DRW111 linear-to-the-h contrasts as
a general maxim, it is interesting to consider what would have been
the effect if instead we had retained the full bouquets and elected
large contrasts exceeding some threshold value(s), probably less than
or equal to 2.5. Returning to Figure 2, we have R1, W1 and DW11 as
the three most obvious candidates for election, having ratios-to-scale
of 7.8, 4.4 and 3.4, respectively, each of which is above the
corresponding 1% point for the appropriate bouquet size from
Table 12. These, of course, had the highest display ratios after
pretrimming, substantially exceeding the levels of all other contrasts. A
horizontalized plot of the result of electing R1, W1 and DW11,
post-trimming the R, W and DW bouquets and leaving the other bouquets
alone, is shown in Figure 7.

The  vertical  axis  of  the  plot  has  been  truncated  at  50  to  show 
detail,  and  so  the  display  ratios  of  the  elected  contrasts,  which  we  are 
treating  as  individual  bouquets,  are  off  scale.  These  ratios  are  the 
same as in Table 5 and also correspond to the values of "scale" shown
in  the  inserts  of  the  horizontalized  plot.  This  plot  is,  of  course,  a 
middle  ground  between  the  horizontalized  plot  of  the  original 
bouquets  (Figure  2)  and  the  horizontalized  plot  of  the  pretrimmed 
bouquets  (Figure  3).  Depending  on  the  values  of  threshold  selected, 
we could have also considered electing DRW111, RW11 and W2, in
that order. Since the threshold values for suggesting this do not
reach the 10% points of Table 12 in each case, we shall not do so, but
instead will turn to Figure 3.

On looking at the values of ratio-to-scale for the largest contrasts
in the pretrimmed bouquets in Figure 3, we see that R2 and W2 may
be of marginal interest and that no other contrasts distinguish
themselves.

Clearly, a large part of the story is being told by the six largest
nominated contrasts: R1, W1, DW11, DRW111, RW11 and D1, in
apparent order of importance. Of these, we have already super-elected
R1 as the single contrast displaying the most information
about the experiment.

It  is  possible  to  push  the  analysis  of  the  responses  for  person  IB1 
further,  but  in  order  to  do  so  more  readily,  and  to  demonstrate  more 
clearly  certain  other  characteristics  of  our  horizontalized  plots,  we 
will  reformulate  the  response.  First,  however,  we  summarize  what 
we have found so far about IB1's responses.


B5.  The  Final  Outcomes  for  the  First  56  Numbers 

The previous sections have indicated that, as far as difference
limen in grams is concerned, the relationship between the responses
of IB1 and the various factors is largely captured by the nominated
contrasts R1, W1, DW11, DRW111, RW11, D1 and DR11. Let us
model, with i = date, j = rate and k = (initial) weight, the response
in terms of the linear-to-the-h contrasts as

$$
\begin{aligned}
y_{ijk} = {} & a_0 + b_D\,d(i) + b_R\,r(j) + b_W\,w(k) \\
& + b_{DW}\,d(i)\,w(k) + b_{RW}\,r(j)\,w(k) + b_{DR}\,d(i)\,r(j) + b_{DRW}\,d(i)\,r(j)\,w(k) + z_{ijk}
\end{aligned}
$$

where, for convenient comparison of coefficients, we have taken

$$
\sum_i d(i) = \sum_j r(j) = \sum_k w(k) = 0, \qquad
\sum_i d(i)^2 = 1, \qquad \sum_j r(j)^2 = 3, \qquad \sum_k w(k)^2 = 6 .
$$

Then we can summarize our results by

    a_0 = centercept = 50.4
    b_R = rate slope = 24.6

and by the other linear-to-the-h (h = 1, 2, 3) contrasts in the 2x2x2
table:

                      No D                            D
              No R          R              No R             R
    No W      (a_0)         (b_R)          b_D = 3.5        b_DR = 1.3
    W         b_W = -9.5    b_RW = -3.7    b_DW = -11.3     b_DRW = -7.8
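
As a worked check of this summary, the fitted value for a single cell can be computed directly. The sketch below rests on assumptions that the text does not state outright: the scores d, r and w are equally spaced and centered, scaled to the stated sums of squares, with the later date and the larger rates and weights positive; and b_W is taken as negative (-9.5), the sign that reproduces the tabulated residuals and log response times.

```python
import math

# Coefficients as summarized above; the sign of b_W is an assumption.
A0 = 50.4
B = {"D": 3.5, "R": 24.6, "W": -9.5,
     "DW": -11.3, "RW": -3.7, "DR": 1.3, "DRW": -7.8}


def scores(n_levels, sum_sq):
    """Equally spaced, centered scores whose squares sum to sum_sq."""
    raw = [2 * t - (n_levels - 1) for t in range(n_levels)]
    factor = math.sqrt(sum_sq / sum(x * x for x in raw))
    return [x * factor for x in raw]


d = scores(2, 1)  # dates 1, 2
r = scores(4, 3)  # rates 50, 100, 150, 200 gm/30 sec
w = scores(7, 6)  # initial weights 100, 150, ..., 400 grams


def fitted(i, j, k):
    """Fitted D.L. in grams for date index i, rate index j, weight index k."""
    di, rj, wk = d[i], r[j], w[k]
    return (A0 + B["D"] * di + B["R"] * rj + B["W"] * wk
            + B["DW"] * di * wk + B["RW"] * rj * wk
            + B["DR"] * di * rj + B["DRW"] * di * rj * wk)


# Date 2, rate 150, weight 100: fitted value plus the tabulated residual (-16.2).
fit = fitted(1, 2, 0)
observed_dl = fit - 16.2
log_time = math.log(30 * observed_dl / 150)
print(round(fit, 1), round(observed_dl, 1), round(log_time, 2))
```

Under these assumptions the fitted value is about 92.0 grams, the implied observation about 75.8 grams, and the corresponding log response time about 2.72, which agrees with the entry tabulated for that cell later in the chapter.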

The residuals are

                                  Initial Weight (Grams)
    Rate
    (gm/30 sec)  Date     100     150     200     250     300     350     400

         50        1     -1.3     1.5     3.0    -2.8     2.0     2.3     1.9
                   2      8.5     0.3     2.1     0.5     0.8     0.8    -0.4

        100        1      6.2     0.4    -8.4    -8.4     2.2     0.0    -0.1
                   2     -3.2     3.8    -0.8    -4.9    -5.9    -1.8     3.5

        150        1      2.6    -5.9     0.7     3.5     0.4     1.6    -6.4
                   2    -16.2    -2.4    -3.4     1.6   -10.9     9.8     2.7

        200        1     -4.8     1.8     7.2     0.9    -4.3     0.9     3.4
                   2     26.6    14.5   -22.1   -20.7     8.2    -0.5     9.7

The various horizontalized plots show us that, so far as orthogonal
polynomial contrasts go, simple or multiple, there is no
appreciable evidence of needing a more detailed description.

We are coming out as we should: using numbers to describe our
results and pictures to show no need to go further in the terms we
have considered.


PART C: ANALYZING IB1's PERFORMANCE IN OTHER TERMS

C1. Reformulating the Response

It  is  often  the  case  that  the  original  response  variable,  the  values 
of  which  were  recorded  (or  calculated)  in  the  process  of  the 
experiment,  is  not  the  best  variable  to  use  in  the  analysis.  It  is 
sometimes  possible  to  find  a  reformulation  of  the  response  data 
which  yields  a  simpler,  clearer  set  of  relationships  between  the 
dependent  variable  and  the  factors  of  the  experiment.  This  is 
achieved  when  there  are  fewer  important  interactions  and  when  the 
important  main  effects  become  more  pronounced  in  reference  to  the 
background  level.  This  can  sometimes  be  achieved  by  a  more 
trenchant  change  in  the  definition  of  the  response  which  involves  the 
values  of  one  or  more  factors  as  well  as  the  response,  one  that  may 
largely  remove  the  impact  of  a  previously  important  main  effect. 

Besides  seeking  to  simplify  the  relationships  necessary  to 
adequately  describe  the  data,  we  shall  be  delighted  if  we  can  also 
make  the  variability  of  the  response  variable  approximately 
homogeneous. 

In considering the results of our initial analysis of the response
data (as D.L. in grams) for our example person IB1, we noticed that
the linear contrast for the rate main effect, R1, was far and away the
most important single contrast in the experiment. A look at Figure 1
suggests that the relationship between difference limen in grams and
rate might be close to being a proportional one; Table 13 confirms
this. In this table we show the average of the difference limen values
(in grams) for IB1 for each level of rate, averaging across all levels of
date by weight within each rate level. The second line of the table is
the ratio of this average level in grams to the rate. To produce simple
units, since rate is measured in grams per 30 seconds, we multiply by
30 so that the response is now in seconds of time. Since the original
response variable, D.L., was the number of grams of water added to a
pail at a constant rate until the person could detect a difference in
pull, the new response (given in the second line of Table 13) is a
response time.


TABLE 13. Average difference limen in grams by rate and its ratio to rate
for person IB1.

                                              Rate (grams/30 seconds)
                                              50      100     150     200

    Average D.L. (in grams)                  23.2    39.7    58.4    80.5
    30 x (average D.L. / rate) (seconds)     13.9    11.9    11.7    12.1
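
The second line of Table 13 is simple arithmetic on the first, and a short sketch makes the unit conversion explicit. The variable names are illustrative; the numbers are those of Table 13.

```python
# Rates are in grams per 30 seconds, so D.L. / rate is in units of 30 seconds;
# multiplying by 30 converts the ratio to seconds.
rates = [50, 100, 150, 200]
avg_dl = [23.2, 39.7, 58.4, 80.5]  # average D.L. in grams, by rate

response_time = [30 * dl / rate for dl, rate in zip(avg_dl, rates)]
rounded = [round(t, 1) for t in response_time]
print(rounded)
```

The average D.L. grows by a factor of about 3.5 across the four rates, while the response time stays within roughly 12 to 14 seconds.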

The  relatively  constant  response  time  (relatively  constant 
compared  to  a  change  from  23  grams  to  80  grams)  for  the  various 
levels  of  rate  implies  that  the  relationship  between  response  and  rate 
can  be  largely  explained  by  assuming  that  the  person  responds  after  a 
constant  time,  regardless  of  the  rate.  (This  was  found  by  Green  and 
Tukey  to  hold  in  a  collective  analysis  of  all  8  persons.) 

Since  the  large  rate  effect  can  be  substantially  explained  in  the 
above  manner,  we  can  obtain  a  simpler  analysis  by  changing  our 
response  from  difference  limen  (in  grams)  to 


$$
\text{response time} = \frac{\text{difference limen}}{\text{rate}} \times 30 \quad \text{(in seconds)}.
$$


Re-Expressing  the  Reformulated  Response 

We shall, in fact, go slightly farther. In their analyses of the
entire experiment, Green and Tukey found, when the dependent
variable was re-expressed as response time, that the standard
deviations were approximately linearly related to the means. This
can be seen most clearly by comparing persons. However, a plot of
the residuals versus predicted values for a linear-to-the-h fit of the
response time for our example person also shows an increase in the
variability as the predicted response time increases. In order to
produce more homogeneous variability, Green and Tukey
re-expressed the reformulated response on the log scale. We will do the
same, and so our new response is

log(response  time), 

the  natural  log  of  the  response  time  as  defined  above. 

Figure  8  shows  the  relationship  between  our  new  response 
variable  and  rate  within  each  combination  of  weight  and  date  levels. 
The  actual  data  values  appear  in  Table  14. 


                                  Initial Weight (grams)
    Rate
    (gm/30 sec)  Date     100     150     200     250     300     350     400

         50        1     2.68    2.72    2.71    2.36    2.52    2.45    2.34
                   2     3.21    2.88    2.84    2.66    2.53    2.37    2.08

        100        1     2.67    2.51    2.24    2.21    2.48    2.40    2.37
                   2     2.88    2.89    2.68    2.44    2.22    2.14    2.10

        150        1     2.50    2.34    2.45    2.49    2.44    2.45    2.29
                   2     2.72    2.77    2.63    2.56    2.13    2.36    1.98

        200        1     2.35    2.44    2.51    2.44    2.37    2.45    2.48
                   2     3.10    2.92    2.40    2.23    2.46    2.13    2.08


C2. A First Analysis of Log Response Time

Having reformulated our response from D.L. (in grams) to log
response time (in log seconds), we proceed with an analysis of the
apparent relationships involving the various factors (date, rate and
weight, as before) via the horizontalized plotting techniques. As an
initial step, we will use the orthogonal polynomials, as before, to
produce the single-degree-of-freedom contrasts which define our
various bouquets. As in the analysis of D.L. in grams, we nominate
the linear-to-the-h contrasts D1, R1, W1, DR11, DW11, RW11 and
DRW111 as a priori important and pretrim our bouquets. Going
through the same steps as in the analysis for D.L., we reach the plots
of Figure 9. (We note, in passing, that RW11 and DRW111 are now
not the largest contrasts in their original bouquets; we had already
nominated them anyway.)

In  our  plots  of  the  display  ratios  (in  units  of  log  seconds)  in 
Figure  9,  we  have  truncated  the  vertical  axis  at  0.6  in  order  to  better 
show  the  detail.  This  truncation  has  put  the  two  largest  display 
ratios,  W1  and  DW11,  off  scale.  The  display  ratios  for  the  nominated 
contrasts,  which  we  are  treating  as  7  separate  bouquets,  are  listed  in 
Table  15.  Also  included  in  Table  15,  for  comparison  purposes,  are  the 
display  ratios  (in  units  of  grams)  from  the  pretrimmed  analysis  of  the 
original  response  (in  grams),  as  well  as  the  median  display  ratio 
across  all  55  contrasts  for  both  responses  and  the  ratio  of  display  ratio 
to  these  medians. 


TABLE 15. Display ratios (in units of log_e seconds) for the polynomial
analysis of log response time for person IB1 (display ratios for
D.L. in grams included for comparison).

                       Log Response Time               D.L. in Grams
                  Display Ratio     Ratio to     Display Ratio    Ratio to
    Contrast      (log_e seconds)   Median       (grams)          Median

    W1                 1.97           16.4            98            10.9
    DW11               1.42           11.8            82             9.1
    R1                  .54            4.5           237            26.3
    D1                  .34            2.8            28             3.1
    RW11                .33            2.8            33             3.7
    DRW111              .24            2.0            49             5.4
    DR11                .16            1.3             9             1.0

    Median display
    ratio over all
    55 contrasts        .12                            9


Comparison of Analyses

We can see that the relative importance of the R1 contrast has
been  considerably  reduced  by  the  reformulation  but  that,  in 
comparison  to  the  median  display-ratio,  it  still  merits  appreciable 
attention.  The  relative  apparent  importance  of  the  weight  and  date- 
by-weight  linear  contrasts  remains  high  and  has,  in  fact,  been 
increased  by  going  to  log  response  time.  The  relative  importance  of 
the  remaining  four  nominated  contrasts  has  generally  been  decreased. 

Table  16  shows  the  display  ratios  and  ratios-to-scale  for  the 
nominated  contrasts  when  they  are  treated  as  the  7-contrast 
nominated  bouquet.  Based  on  the  sizes  of  their  ratios-to-scale,  we 
might  conceivably  consider  super-electing  W1  and  DW11  as  the  most 
important  contrasts,  but  their  relative  importance  over  the  remainder 
of  the  nominated  contrasts  seems  somewhat  slight. 
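
The entries of Table 16 can be approximately rebuilt from the single-contrast display ratios of Table 15. The sketch below is a reconstruction under assumptions, not the authors' procedure: working values are taken as half-normal quantiles at plotting positions (3i - 1)/(3n + 1), which reproduces the tabulated working values to rounding, and the scale is taken as the median display ratio of the bouquet. Because it starts from two-decimal display ratios, agreement is only to about two figures.

```python
from statistics import NormalDist, median


def working_values(n):
    """Assumed half-normal working values for a bouquet of n, largest first."""
    nd = NormalDist()
    return [nd.inv_cdf(0.5 + 0.5 * (3 * i - 1) / (3 * n + 1))
            for i in range(n, 0, -1)]


# Display ratios of the nominated contrasts treated as 7 separate bouquets
# (Table 15, log response time).
single = {"W1": 1.97, "DW11": 1.42, "R1": 0.54, "D1": 0.34,
          "RW11": 0.33, "DRW111": 0.24, "DR11": 0.16}

# A one-contrast bouquet has a single working value (about .674), so the
# size of contrast is the display ratio times that value.
w1 = working_values(1)[0]
sizes = {name: ratio * w1 for name, ratio in single.items()}

# Treat the seven as one bouquet: rank by size, divide by the working values.
ranked = sorted(sizes, key=sizes.get, reverse=True)
wv7 = working_values(len(ranked))
display = {name: sizes[name] / wv7[pos] for pos, name in enumerate(ranked)}
scale = median(display.values())  # assumption: scale = median display ratio
ratio_to_scale = {name: display[name] / scale for name in ranked}

for name in ranked:
    print(f"{name:7s} {display[name]:.2f} {ratio_to_scale[name]:.2f}")
```

On this reading the largest contrast, W1, has a ratio-to-scale of only about 1.4 within the nominated bouquet, which is why super-election looks unconvincing here.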

Examining  the  ratios-to-scale  for  the  largest  sizes  of  contrasts 
within  the  trimmed  bouquets,  we  find  these  to  be  generally  small,  the 
largest  being  1.4  for  RW32,  the  cubic-by-quadratic  interaction  for 
rate-by-weight, and also for DW16, the linear-by-6th-degree
interaction  for  date-by-weight.  Because  of  the  relatively  small  sizes 
of  these  ratios-to-scale  and  because  of  the  unpromising  nature  of  the 
contrasts  associated  with  them,  we  will  not  elect  either  one. 


General  Levels 

In  looking  at  the  general  level  of  the  display  ratios  for  the 
trimmed  bouquets  (also  reflected  by  their  value  of  "scale"  indicated 
on  the  plots),  we  feel  that  for  the  most  part  these  display  ratios  are 
measuring  noise.  There  are  two  exceptions  —  the  first  being  the 
relatively  large  value  of  the  display  ratios  of  the  smallest  size  of 
contrast  within  the  three-factor  bouquet  —  which  we  again  take  as 
measuring  some  type  of  isolated  error  or  granularity  and  accordingly 
ignore.  The  other  exception  corresponds  to  the  trimmed  rate 
bouquet,  measuring  the  quadratic  and  cubic  contributions  of  rate,  and 
is  of  more  interest.  The  levels  of  the  display  ratios  for  both  the 
contrasts  in  this  bouquet  are  nearly  the  same  and  are  noticeably 
above  the  general  level  ("scale")  of  the  other  trimmed  bouquets  and 
above  the  levels  of  two  of  the  nominated  contrasts.  This  type  of 
general  inflation  of  all  levels  of  a  trimmed  bouquet,  with  no  contrast 
indicated  as  individually  important,  could  be  indicative  of  a  particular 
phenomenon, the discussion of which we turn to next.


TABLE 16. Display ratios (in units of log_e seconds) for the nominated
contrasts, treating them as a 7-contrast nominated bouquet.

                     Log Response Time For Person IB1
    Contrast      Display Ratio    Working Value    Ratio-to-Scale

    W1                 .79            (1.691)            1.42
    DW11               .80            (1.208)            1.43
    R1                 .41             (.908)             .73
    D1                 .34             (.674)             .61
    RW11               .48             (.472)             .86
    DRW111             .55             (.288)            1.00
    DR11               .97             (.114)            1.74

    scale = .55


D1. Spreading of Contributions across Contrasts

We  have  previously  stated  that  each  trimmed  bouquet,  at  the  end 
of  post-trimming,  where  the  ratios-to-scale  of  the  remaining  contrasts 
are  not  sufficiently  large,  is  deemed  to  consist  of  contrasts  which 
individually  are  not  imparting  significant  information  about  the 
relationships  between  the  response  variable  and  the  collective 
components  of  the  factor(s)  embodied  in  the  (trimmed)  bouquet.  This 
need  not  mean  that  the  trimmed  bouquet  is  certain  not  to  be  telling 
us  anything  of  importance  about  the  data. 

The  elected  contrasts  from  post-trimming  are  each,  potentially, 
individually  representing  systematic  relationships  in  the  data  (of 
varying  degrees  of  strength  depending  on  their  sizes  of  trimmed-out 
display  ratios).  It  is  possible  that  there  are  other,  real,  systematic 
relationships  within  the  data  which  are  not  individually  captured  by 
any  of  the  various  contrasts  which  have  been  selected  to  define  the 
(original  untrimmed)  bouquets.  Such  a  systematic  relationship  is 
then  jointly  indicated  by  a  number  of  contrasts,  and  its  actual  size  is 
spread  across  those  contrasts. 

The  effect  of  such  a  situation  might  be  a  bouquet  (trimmed  or 
untrimmed) where the general level of the display ratios (the scale) is
at the background noise level and where the largest few sizes of
contrast  correspond  to  the  contrasts  jointly  indicating  the  systematic 
relationship.  Because  this  relationship  is  spread  among  a  number  of 
the  contrasts,  it  is  possible  that  not  all,  or  even  none,  of  these 
contrasts  will  be  flagged  in  post-trimming  as  individually  indicating 
an  interesting  contribution. 

If  the  systematic  relationship  is  spread  across  enough  of  the 
defining  contrasts  for  a  bouquet,  the  scale  for  that  bouquet  will  be 
inflated,  and  the  plot  of  the  display  ratios  for  the  bouquet  will  be  at  a 
general  level  above  the  background  noise  level,  although  no 
individual  contrasts  in  the  bouquet  may  be  flagged  as  individually 
interesting. 

It  is  sometimes  possible  —  either  in  these  two  circumstances  or, 
better,  initially  —  to  select  a  different  bouquet  of  defining  contrasts 
for  the  line  of  the  analysis-of-variance  table  which  will  more  nearly 
isolate  the  systematic  contributions,  each  into  a  single  (different) 
contrast,  and  result  in  a  potentially  simpler  account  of  the 
interrelationships  in  the  data. 


D2.  Some  Useful  Bouquets  of  Contrasts 

Some  interesting  bouquets  of  orthogonal  contrasts  emphasize  the 
ordering  of  the  versions  of  a  factor.  A  classical  example  of  those 
using  only  order  are  the  Helmert  contrasts,  which,  for  example,  may 
be  formulated  to  compare  the  response  value  of  the  first  version  with 
the  average  of  the  remaining  versions,  the  value  of  the  second  with 
the  average  of  all  but  the  first  two,  and  so  on,  the  last  such  contrast 
comparing  the  response  value  of  the  next-to-last  version  with  that  of 
the  last  version.  We  will  call  these  Helmert  SFP  contrasts,  for 
"Starting with First Point." There are also Helmert SLP contrasts,
starting  with  the  last  point.  In  our  situation,  however,  we  would  like 
to  use  both  order  and  value.  In  particular,  we  will  often  want  to 
include  the  linear  contrast. 
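
For concreteness, the Helmert SFP contrasts described above can be generated as follows. The function names are illustrative, and integer coefficients are used (first version weighted by the number of remaining versions) rather than normalized ones.

```python
from itertools import combinations


def helmert_sfp(n):
    """Helmert contrasts starting with the first point, for n ordered versions.

    Row s compares version s + 1 with the average of all later versions.
    """
    rows = []
    for s in range(n - 1):
        rest = n - s - 1
        rows.append([0] * s + [rest] + [-1] * rest)
    return rows


def helmert_slp(n):
    """Helmert contrasts starting with the last point (mirror image of SFP)."""
    return [row[::-1] for row in helmert_sfp(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


sfp4 = helmert_sfp(4)
for row in sfp4:
    print(row)

# Each row sums to zero and the rows are mutually orthogonal.
orthogonal = all(dot(a, b) == 0 for a, b in combinations(sfp4, 2))
```

For four versions this gives (3, -1, -1, -1), (0, 2, -1, -1) and (0, 0, 1, -1). None of these is the linear contrast, which is the limitation the text notes.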


LPO's  and  FPO's 

An interesting alternative was recently considered by Daniel
(1985), who notes that if a set of m responses at equally spaced
versions of a factor are nearly linear in that factor, then a commonly
observed deviation from linearity is localized at one end, the
remaining m - 1 points falling close to a straight line. He defines the
contrast LPO_m, for Last Point Off of m, which measures the deviation
of the mth point in the sequence from its predicted value based on a
least squares line through the previous m - 1 points. Table 17 shows
these Last Point Off contrasts for m = 3, ..., 7. Given n equispaced
versions of a factor, the collection {LPO_n, LPO_{n-1}, ..., LPO_3, L} defines
a set of orthogonal contrasts, where L is the ordinary linear contrast
and LPO_i compares the observed value at the ith level with the
predicted value from the line through the values for the first i - 1
levels.

General coefficients for LPO_m, m ≥ 3:

$$
\text{$i$th value} = m + 1 - 3i \quad (i < m), \qquad
\text{end ($m$th value)} = \tfrac{1}{2}(m-1)(m-2),
$$

$$
\text{Sum of Squares} = \tfrac{1}{4}(m-2)(m-1)\,m\,(m+1) .
$$
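
The general coefficients can be checked against Table 17 with a short sketch. The helper names are illustrative; `lpo(m, n)` pads the contrast with zeros so that it applies to the first m of n levels, and reversing it gives the corresponding First Point Off contrast on the last m levels.

```python
from itertools import combinations


def lpo(m, n=None):
    """Last Point Off contrast of order m, zero-padded to n levels."""
    n = m if n is None else n
    body = [m + 1 - 3 * i for i in range(1, m)]
    end = (m - 1) * (m - 2) // 2  # (m-1)(m-2) is always even
    return body + [end] + [0] * (n - m)


def fpo(m, n=None):
    """First Point Off contrast: the LPO contrast with the ordering reversed."""
    return lpo(m, n)[::-1]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


n = 7
linear = [2 * i - (n + 1) for i in range(1, n + 1)]  # proportional to -3..3
bouquet = [lpo(m, n) for m in range(n, 2, -1)] + [linear]

sums_of_squares = {m: dot(lpo(m), lpo(m)) for m in range(3, 8)}
all_orthogonal = all(dot(a, b) == 0 for a, b in combinations(bouquet, 2))
print(lpo(7), sums_of_squares, all_orthogonal)
```

The sums of squares come out as 6, 30, 90, 210 and 420, matching the last line of Table 17, and the six contrasts for n = 7 are mutually orthogonal.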


If we use the last column, labeled "(i)", in Table 17 as our
ordering, we get Daniel's First Point Off contrasts. Ordinarily, in
those  circumstances  where  LPO  or  FPO  contrasts  are  likely  to  be 
helpful,  either  advance  insight  or  data  behavior  will  make  it  clear 
which  to  select.  But  there  may  be  doubt. 


TABLE 17. Last Point Off contrasts for equally spaced
levels illustrated for m from 3 to 7 (* marks
special point).

      i      LPO_3    LPO_4    LPO_5    LPO_6    LPO_7     (i)

      1        1        2        3        4        5        7
      2       -2       -1        0        1        2        6
      3        1*      -4       -3       -2       -1        5
      4                 3*      -6       -5       -4        4
      5                          6*      -8       -7        3
      6                                  10*     -10        2
      7                                           15*       1

     SSq       6       30       90      210      420


EPO's

In  such  doubtful  cases,  the  EPO  contrasts,  for  End  Point  Off,  which 
treat  both  ends  more  nearly  symmetrically,  may  be  in  order.  These 
can be easily made up from FPO's and LPO's. Table 18 shows
examples for n = 7 and n = 6, where the slight "preference" has been
given  to  LPO's.  (In  the  world  as  a  whole,  we  believe  there  is  at  least 
as  much  curvature  near  the  upper  end  of  the  range  as  near  the  lower 
end.) 


TABLE 18. Double-ended (EPO) contrasts for equally spaced levels for
n = 7 and n = 6 (* marks special point; note slight
preference for LPO).

    n = 7
    Rank         L     m = 7    m = 6    m = 5    m = 4    m = 3

    1 of 7      -3       5       10*       -        -        -
    2 of 7      -2       2       -8        3        3*       -
    3 of 7      -1      -1       -5        0       -4        1
    4 of 7       0      -4       -2       -3       -1       -2
    5 of 7       1      -7        1       -6        2        1*
    6 of 7       2     -10        4        6*       -        -
    7 of 7       3      15*       -        -        -        -

    SSq         28     420      210       90       30        6
    Identity     L     LPO_7    FPO_6    LPO_5    FPO_4    LPO_3

    n = 6
    Rank         L     m = 6    m = 5    m = 4    m = 3

    1 of 6      -5       4        6*       -        -
    2 of 6      -3       1       -6        2        1*
    3 of 6      -1      -2       -3       -1       -2
    4 of 6       1      -5        0       -4        1
    5 of 6       3      -8        3        3*       -
    6 of 6       5      10*       -        -        -

    SSq         70     210       90       30        6
    Identity     L     LPO_6    FPO_5    LPO_4    FPO_3


SEPO's


If it is really important to have symmetry, to the extent that we do
not mind irrational coefficients, we can arrange for it by combining
pairs of LPO's and FPO's. We will call these SEPO contrasts, for
Symmetric End Point Off. For n = 7, for example, we can use L,
FPO_7 + √2 FPO_6, LPO_7 + √2 LPO_6, FPO_5 + √3 FPO_4, LPO_5 + √3 LPO_4,
and LPO_3 = FPO_3. In these contrasts, both FPO_7 and FPO_6 treat the first
data point (of the 7) as special, LPO_7 and LPO_6 the last data point,
FPO_5 and FPO_4 treat the second data point (of the original 7) as
special and compare it with the next 4 and 3 points respectively, and
so on. For either initial or intermediate values of m > 4, the First
Point SEPO combination is

$$
\mathrm{FPO}_m + \sqrt{\frac{m+1}{m-3}}\;\mathrm{FPO}_{m-1},
$$

with sum of squared coefficients

$$
\frac{m(m-2)(m+1)}{2}\left[(m-1) + \sqrt{(m+1)(m-3)}\,\right].
$$
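
A numerical check of the n = 7 SEPO set is sketched below. The placement of each FPO and LPO piece follows the description in the text (which data point is treated as special); the multiplier sqrt((m + 1)/(m - 3)) and the leading factor m(m - 2)(m + 1)/2 in the sum of squares are the forms under which the set comes out orthogonal, and should be read as a reconstruction. Helper names are illustrative.

```python
import math
from itertools import combinations


def lpo(m):
    return [m + 1 - 3 * i for i in range(1, m)] + [(m - 1) * (m - 2) // 2]


def fpo(m):
    return lpo(m)[::-1]


def place(coeffs, start, n):
    """Embed coeffs in a length-n vector, beginning at 1-based position start."""
    tail = n - (start - 1) - len(coeffs)
    return [0.0] * (start - 1) + [float(c) for c in coeffs] + [0.0] * tail


def combine(a, b, weight):
    return [x + weight * y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sepo_weight(m):
    return math.sqrt((m + 1) / (m - 3))


n = 7
bouquet = {
    "L": [float(i - 4) for i in range(1, n + 1)],
    "F76": combine(place(fpo(7), 1, n), place(fpo(6), 1, n), sepo_weight(7)),
    "L76": combine(place(lpo(7), 1, n), place(lpo(6), 2, n), sepo_weight(7)),
    "F54": combine(place(fpo(5), 2, n), place(fpo(4), 2, n), sepo_weight(5)),
    "L54": combine(place(lpo(5), 2, n), place(lpo(4), 3, n), sepo_weight(5)),
    "M3": place(lpo(3), 3, n),
}

max_cross = max(abs(dot(bouquet[a], bouquet[b]))
                for a, b in combinations(bouquet, 2))
ss_f76 = dot(bouquet["F76"], bouquet["F76"])
ss_formula = 7 * 5 * 8 / 2 * (6 + math.sqrt(8 * 4))
print(max_cross, ss_f76, ss_formula)
```

All fifteen cross products vanish to rounding error, and each first-point combination is the mirror image of the matching last-point combination, which is the symmetry the SEPO set is meant to provide.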


Double-Ended  Helmert  Contrasts 

We  can  also  define  a  double-ended  set  of  Helmert-type  contrasts, 
as  illustrated  in  Table  19. 

General coefficients for m non-zero entries (m > 3):

$$
\text{end values} = \frac{m-2}{2} \pm \frac{\sqrt{m(m-2)}}{2}, \qquad
\text{inner values} = -1, \qquad
\text{Sum of Squares} = m(m-2) .
$$
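
A short verification, offered as a sketch, shows that these coefficients do define a contrast with the stated sum of squares. Write a and b for the two end values, so that there are m - 2 inner values each equal to -1:

```latex
\begin{aligned}
a + b &= (m-2), \qquad\text{so}\qquad a + b + (m-2)(-1) = 0,\\[4pt]
a^2 + b^2 &= 2\left[\frac{(m-2)^2}{4} + \frac{m(m-2)}{4}\right]
           = \frac{(m-2)(2m-2)}{2} = (m-1)(m-2),\\[4pt]
a^2 + b^2 + (m-2)(-1)^2 &= (m-1)(m-2) + (m-2) = m(m-2).
\end{aligned}
```

The coefficients therefore sum to zero, and the total of their squares is m(m - 2), as stated.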


D3.  Alternative  Descriptions  of  the  (Log-Response)  Rate  Effect 

We have noted that the trimmed rate bouquet (consisting of R3
and R2) has display ratios which are nearly equal and which appear
high relative to the assumed background level. From Figure 9 we can
see that the general level for the trimmed rate bouquet, as measured
by its scale, .285, is 3 times that of the apparent background level
(.092) as measured by the median scale of the 6 trimmed bouquets. A
pattern of display ratios such as this, given its high level relative to
the background, is suggestive of the spreading of a systematic
relationship across contrasts.

In fact, looking at the values of the size of contrast for the
polynomial decomposition of the rate main effect, given in Table 20,
we see that the linear and quadratic contrasts are both relatively large
and of roughly the same magnitude.

Since  the  rate  main  effect  has  only  three  degrees  of  freedom,  this 
near  equality  of  the  two  largest  sizes-of-contrast  suggests  that  another 
orthogonal  decomposition  might  produce  a  simpler  description  of  the 
relationship  between  log  response  time  and  rate.  To  help  understand 
if  this  is  possible,  we  consider  Figure  10,  the  plot  of  average  values  of 
log  response  time  by  rate  for  each  date  (averaging  across  weight 
within  each  combination  of  rate  and  date).  The  plot  firstly  shows  a 
similar  relationship  between  average  log  response  time  and  rate 
within  date,  with  a  minor  difference  in  the  slopes  of  the  simple  linear 
fits  of  log  response  time  vs  rate  —  accounting  for  the  moderate  DR  11 
display  ratio.  More  importantly,  within  a  given  date  the  levels  of  the 
response  variable  for  the  latter  three  rates  (100,  150  and  200)  are  all  at 
roughly  the  same  value  and  notably  lower  than  the  level  of  log 
response time for the rate of 50 grams/30 seconds.


Results  for  Various  Bouquets 

This  latter  observation  suggests  that  a  different  bouquet  of 
orthogonal  contrasts  might  usefully  be  considered  in  defining  the 
rate  effect.  Specifically,  we  want  a  set  of  contrasts  which  emphasize 
the  ordering  of  the  levels  of  the  factor,  such  as  those  given  in  the  last 
section. 

Table  21  shows  the  results  of  applying  a  number  of  different 
bouquets  of  contrasts  to  the  average  values  of  the  log  response  time 


TABLE  20.  Values  of  the  size  of  contrast  for  the 
polynomial  decomposition  of  the  rate 
main  effect  for  IB1  (log  response 
time). 


Size  of  Contrast 

Contrast 

(log.  Seconds) 

R  1 

.365 

R  2 

.315 

R  3 

.100 
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for  each  of  the  rates.  These  averages  are  shown  as  the  first  line  of  the 
table.  The  remainder  of  the  table  consists  of  the  sizes  of  contrast  and 
display  ratios  that  would  be  obtained  if  each  of  the  bouquets  given  in 
the  exhibit  were  used  in  defining  the  rate  main  effect.  The  sizes  of 
contrast  are  those  which  would  be  obtained  from  a  full  analysis  (and 
are  ~J\A  times  the  normalized  values  which  would  be  obtained  by 
applying  the  contrasts  to  the  averages  by  rate).  The  display  ratios  are 
for  the  original  three-contrast  bouquet  (with  working  values  1.282, 
.674,  and  .253).  We  have  arranged  the  terms  in  each  set  of  contrasts 
so  that  the  resulting  sizes  of  contrast  come  out  in  descending  order. 
(Notice  that  EPO  and  LPO  are  identical  for  n  —  4.)  Clearly,  in  view  of 
the  display  ratios,  if  spreading  is  occurring  with  the  polynomial 
bouquet,  then  it  is  also  occurring  with  the  EPO,  LPO,  FPO,  and  SEPO 
bouquets  of  contrasts.  The  common  element  is,  of  course,  the 
inclusion  of  the  linear  contrast. 
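
The working values quoted above for a three-contrast bouquet (1.282, .674 and .253) coincide with half-normal quantiles taken at plotting positions (3i - 1)/(3n + 1). The sketch below assumes that rule. It is inferred from the quoted numbers rather than from a definition given in this section, and the function name is ours.

```python
from statistics import NormalDist


def working_values(n):
    """Half-normal quantiles at positions (3i - 1)/(3n + 1), largest first.

    Assumption: this plotting-position rule is inferred from the working
    values quoted in the text; it is not a definition stated there.
    """
    z = NormalDist()  # standard normal distribution
    values = []
    for i in range(n, 0, -1):
        p = (3 * i - 1) / (3 * n + 1)          # probability that |Z| falls below the quantile
        values.append(z.inv_cdf((1 + p) / 2))  # corresponding half-normal quantile
    return values


if __name__ == "__main__":
    print([round(w, 3) for w in working_values(3)])
```

For n = 3 the values round to 1.282, .674 and .253, and for n = 1 the single value rounds to .674, the working value used for a one-contrast bouquet.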


Figure 10. Person IB1 log response time vs rate, by date (lines are the linear
fit). 


TABLE  21.  Values  of  size  of  contrast  and  display  ratio  for  various  bouquets  of  orthogonal  decompositions  of  rate 
effects  (person  IB1  —  log  response  time). 

If we consider the display ratios for the Helmert SFP bouquet, we
see  that  the  majority  of  the  information  about  the  effect  of  rate  on 
log  response  time  is  captured  in  the  first  contrast,  the  one  comparing 
the  response  at  rate  50  with  the  average  of  the  other  responses. 
While  it  is  difficult  to  justify  nominating  this  contrast,  we  can 
assuredly  elect  it,  as  its  ratio-to-scale  is  5.7.  (According  to  Table  12, 
5.7  is  beyond  the  0.5%  level.  Thus  if  we  make  an  allowance  for 
multiplicity  of  between  6  —  the  number  of  alternative  bouquets  in 
Table  21  —  and  8  —  the  largest  number  of  alternative  bouquets  we 
might  reasonably  have  considered,  we  are  still  well  beyond  5%.)  The 
double-ended  Helmert  contrasts  also  produce  much  the  same  result. 


D4.  The  Example  after  Rescission 

Adopting  the  Helmert  SFP  contrasts  as  our  scission  of  rate,  and 
appropriately  adjusting  the  definitions  of  all  two-  and  three-factor 
contrasts  which  involve  rate,  will  produce  the  horizontalized  plots 
shown  in  Figure  11. 

In the plots we use the following notation: r1 is the Helmert SFP
contrast comparing the first rate (i.e., 50 gms/30 seconds) with the
average of the remainder, r2 compares the second rate with the
average of the 3rd and 4th, and r3 compares the third rate with the
fourth. The two-factor interactions involving rate are obtained as the
outer product of the Helmert contrasts for rate and the polynomial
contrasts for the other factor. Thus, rW11 is the interaction involving
the first Helmert contrast for rate and the linear polynomial contrast
for weight. Similarly, the three-factor contrast, DrW123, combines the
day-to-day difference, the second Helmert contrast for rate, and the
cubic polynomial contrast for weight.
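
The construction just described can be sketched in a few lines. The sign convention (named level positive), the scaling to unit length, and the equally spaced linear contrast for the 7 weights are assumptions made for illustration, and the helper names are ours.

```python
from math import sqrt


def normalize(c):
    """Scale a contrast to unit length."""
    norm = sqrt(sum(x * x for x in c))
    return [x / norm for x in c]


def helmert(n_levels):
    """Helmert-type contrasts: level k versus the average of the later levels."""
    contrasts = []
    for k in range(n_levels - 1):
        later = n_levels - k - 1
        # zeros for earlier levels, +later for level k, -1 for each later level
        c = [0.0] * k + [float(later)] + [-1.0] * later
        contrasts.append(normalize(c))
    return contrasts


def outer(a, b):
    """Outer product of two contrasts, giving an interaction contrast."""
    return [[x * y for y in b] for x in a]


def dot(a, b):
    """Inner product of two contrasts of equal length."""
    return sum(x * y for x, y in zip(a, b))


# Rate has 4 levels: r1 is rate 50 vs the rest, r2 is 2nd vs 3rd and 4th, r3 is 3rd vs 4th.
r1, r2, r3 = helmert(4)

# Linear polynomial contrast for 7 weight levels (assumed equally spaced).
w_linear = normalize([-3, -2, -1, 0, 1, 2, 3])

# rW11: first Helmert contrast for rate crossed with the linear weight contrast (4 x 7).
rW11 = outer(r1, w_linear)
```

Because each one-factor contrast sums to zero and has unit length, the 4 x 7 interaction contrast also sums to zero and has unit length, and the three rate contrasts remain mutually orthogonal.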

As we have been doing all along, we have nominated the
linear-to-the-j contrasts which do not involve rate (i.e., D1, W1 and DW11)
as a priori potentially important contrasts. As mentioned above, it is
difficult to justify nomination of any of the Helmert contrasts, which
is why no contrasts involving rate have been nominated. We have,
however, elected r1 as an a posteriori important contrast in view of its
ratio-to-scale within the full three-contrast Helmert SFP bouquet. No
other contrasts involving rate can be elected for any reasonable
threshold value.

Using the Helmert contrasts as our scission of rate has produced a
simplification in the apparent relationship between the response and
the three factors. This rescission has eliminated the apparent
importance of any interaction involving rate in an adequate
description of the data and has isolated the (main) effect of rate into
the comparison of the response at the lowest rate with the average of
the responses at all other rates.

Figure 11. Person IB1: log response time in Ln seconds. Helmert contrasts for rate,
polynomial contrasts for weight. W1, DW11 nominated; r1 elected.


D5.  Refactoring 

Our  general  attitude  of  "redoing  anything  that  seems  to  deserve  it, 
at  least  on  a  trial  basis"  should  by  now  be  clear.  We  have  considered, 
in  increasing  order  of  drasticness:  rescission  into  contrasts,  re- 
expression  of  our  response,  (by  implication  at  least)  re-expression  of 
our  factors,  and  reformulation  of  the  response.  Well  along  in  this 
order  we  should  also  consider  another  redo,  refactoring  of  the  pattern 
of  analysis.  Actually,  as  we  shall  soon  see,  our  example  already 
illustrates  this. 


Splitting 

The earliest approach to the simplest sort of refactoring seems to
be that of Brownlee's (1947) concise World War II book Industrial
Experimentation, which was heavily concerned with 2^k designs with
k = 4, 5 or 6 factors. Brownlee rightly, we feel, emphasized the
frequent  advantages  of  "splitting  the  experiment"  and  then  analyzing 
and  discussing  the  halves  separately.  This  is  particularly  likely  to 
help  when  the  two  versions  of  a  factor  were  "do  Q"  and  "don't  do 
Q,"  and  somewhat  less  likely  to  help  when  the  versions  were  "the 
high  level  of  Y"  and  "the  low  level  of  Y." 

Brownlee  decided  whether  or  not  to  split  in  terms  of  the 
appearance  of  a  significant  interaction,  which  was  not,  for  this 
purpose,  compared  with  the  mean  square  above  it.  We  would  feel 
that  the  proper  reason  for  splitting  is  having  something  just  above, 
the  size  of  whose  effect  (or  mean  square)  is  not  much  larger  than  the 
interaction  (which  does  itself  need  to  appear  not  to  be  pure  error). 
This  formulation  makes  it  much  clearer  what  is  to  be  split  —  those 
factors  in  the  substantial  interaction  which  do  not  appear  in  the  label 
of  the  similar-sized  mean  square  above. 

In Brownlee's case (2^k for small k, high-order interactions for
error) it was not easy for an interaction to be significant, and if
significance was reached, it was not easy for the main effects to be
considerably larger still (unless they were known about all the time).
Thus, for 2^k, his approach usually led to decisions to split that would
also be made on the basis of what we assert to be appropriate reasons.
The fact that this would not be true for designs whose factors have
several  versions  may  account  for  the  disappearance  of  the  idea  of 
splitting,  both  from  Brownlee's  later  books  and  from  the  literature 
generally. 


Splitting  into  Persons 

The  experiment  from  which  our  example  is  drawn  was  designed 
as  a  crossing  of  the  sort  of  2  x  4  x  7  treatment  pattern  we  have  been 
analyzing  for  a  single  person  by  a  pattern  involving  8  subjects.  The 
subject  pattern  involved  2  persons  in  each  cell  of  a  2  x  2  for  male  vs. 
female  and  seeing  vs.  blind.  (The  original  failure  to  make  a  sensible 
analysis  corresponded  to  an  a  priori  assumption  that  replication  of 
persons  within  cell  belonged  in  the  lowest  error  term,  quite  contrary 
to the trustworthy maxim that "people will be different!")

Actually,  the  8  people  did  behave  quite  differently,  both  in  slopes 
against  individual  factors,  and  in  difference  of  slopes  from  day  to  day. 
Splitting,  at  least  initially,  the  data  into  8  portions,  one  for  each 
person,  seems  to  be  an  essential  step  in  understanding  what  is  going 
on.  This  is  a  simple  and  important  instance  of  refactoring. 

Once  we  have  done  this,  we  can  look  at  sets  of  8  numbers,  one  for 
each  person,  for  both  individual  and  collective  responses,  and  ask 
what they seem to show, particularly in terms of the imposed 2 x 2
design.  In  general,  we  see  little  associated  with  the  factors  of  sex  and 
sight  (somewhat  confounded,  as  they  were,  with  age)  but  strong 
emphasis  on  "people  will  be  different". 

We  turn  briefly  to  the  question  of  seeking  limited  consistency  of 
behavior  across  persons  in  Part  E. 


Tacit  Refactoring 

Actually, of course, the original data of Johnson and Tsao is best
thought of as having already been refactored, perhaps unwisely,
before anyone else ever saw any of it. The original 8 x 56 table,
eventually given in Johnson's (1949) book, for 8 persons and 56
condition-date combinations, was a table of means, each of 5
individual trials. The order of trial of the 5 repetitions of 28
conditions for each person was stated to be randomized, though no
details were reported. It would be strange indeed, in view of all the
other things that appear to have been going on in this data, if there
had been no time trends associated with the 140 = 5 x 28 trials for
each of the 16 = 2 x 8 date-person combinations. The effects of these
trends were buried, in an unknown but reputedly random way, in
the table of means, which is all that later analysts have had at their
disposal. In a real sense, this is also an instance of refactoring, if
not of something still more drastic.

We  believe,  then,  that  this  example  illustrates  —  in  more  than  one 
way  —  the  need  for,  and  importance  of,  refactoring  as  a  standard  part 
of  an  analyst's  tool  kit. 


Splitting  on  "Date"  in  the  Example 

In  the  IB1  data,  both  the  weight  slope  and  the  date-by-weight 
slope,  if  not  nominated,  would  be  elected.  Indeed,  the  latter  (DW11) 
is nearly as large as the former (W1).

The  great  difference  in  weight  slopes  for  this  subject  from  one  day 
to  the  other  is  easily  seen  in  a  simple  plot  of  means  for  date-weight 
combinations,  as  in  Figure  12.  This  suggests  splitting  on  date. 




Figure  12.  Person  IB1  log  response  time  vs  weight  separated  by  date  (lines 
are  the  linear  fit). 

The classical analysis of variance, after nomination of all
linear-to-the-j contrasts (now only for R, W, and RW), would take the form
shown in Table 22 (using parallel columns for the two dates):


TABLE 22. Parallel-column analysis-of-variance table splitting on date.

                               df                     MS
Label                   (Date 1)  (Date 2)     (Date 1)  (Date 2)

Rate slope                  1         1          .0322     .1130
Weight slope                1         1          .0704    2.6104
Interaction slope           1         1          .0737     .0022
Trimmed Rate                2         2          .0203     .0362
Trimmed Weight              5         5          .0066
Trimmed Interaction        17        17          .0114

It seems natural to present three panels of display ratios, one for
rate, one for weight, and one for interaction, as in Figure 13, where
we are treating each of the nominated contrasts (d1R1, d1W1 and
d1RW11 for day 1; d2R1, d2W1 and d2RW11 for day 2) as separate
one-contrast bouquets. The display ratio for the largest of these,
d2W1, the weight slope for day 2, is 2.4, which is 4.8 times larger
than that of the next largest nominated contrast (d2R1).


Super-Election  in  the  Split  Example 

Collecting  the  6  nominated  contrasts  into  a  six-contrast  nominated 
bouquet  and  then  computing  display  ratios  produces  the  first  two 
columns  of  Table  23.  Quite  clearly,  the  weight  slope  for  day  2  stands 
out from the rest. (The ratio-to-scale for d2W1 in the 6-contrast
bouquet  is  2.4.)  Electing  this  contrast  and  post-trimming  produce  the 
last  two  columns  of  Table  23. 

If,  instead,  we  use  the  3-contrast  Helmert  SFP  bouquet  as  our 
scission  of  rate,  we  get  the  horizontalized  plots  of  Figure  14.  In  this 
exhibit,  we  have  nominated  the  linear  weight-within-date  contrasts 
(d1W1 and d2W1) and elected the two rate-50-versus-the-rest
Helmert SFP contrasts of rate within date (d1r1 and d2r1). The
election  used  a  threshold  value  of  2.  Notice  that  no  rate-by-weight 
interaction  can  be  elected  with  any  reasonable  threshold.  As  before, 
the  use  of  the  Helmert  contrasts  has  concentrated  the  relationship 
between  response  and  rate  largely  into  a  single  contrast. 

Figure 13. Person IB1: log response time in Ln seconds. Splitting on date; polynomial
contrasts for rate and weight within date.

Combining the two nominated and the two elected contrasts into
a 4-contrast nominated-plus-elected bouquet, we get Table 24. The
ratio-to-scale of d2W1, the linear weight-within-date-2 contrast,
within the 4-contrast bouquet is 1.36, so that while this contrast is
notable, it is not outstandingly large within the nominated-plus-elected
bouquet. In the next section, we will give our final
interpretation of the IB1 data, based on what we have now found.


TABLE 23. Display ratios for the nominated bouquet after splitting on
date.

                6-Contrast Bouquet          Electing d2W1
                                            and Post-Trimming
Contrast      Display     Working         Display     Working
              Ratio       Value           Ratio       Value

d2W1           .998       (1.620)          2.398       (.674)
d2R1           .300       (1.119)           .219      (1.534)
d1RW11         .337        (.804)           .268      (1.010)
d1W1           .477        (.555)           .393       (.674)
d1R1           .533        (.336)           .445       (.402)
d2RW11         .356        (.132)           .299       (.157)

scale = .416                              (trim) scale = .299


TABLE 24. Display ratios for the nominated-plus-elected
bouquet after splitting on date with Helmert
contrasts for rate.

                4-Contrast Bouquet
Contrast      Display Ratio    Working Value

d2W1             1.133            (1.426)
d2r1              .491             (.869)
d1W1              .530             (.502)
d1r1             1.366             (.194)

scale = .831
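
The scale values given in Tables 23 and 24 agree with the median of the display ratios in each bouquet, and the quoted ratios-to-scale (2.4 and 1.36) follow by division. The sketch below assumes that scale is defined as this median. The assumption is inferred from the tabulated numbers rather than from a definition in this section, and the function name is ours.

```python
from statistics import median


def ratios_to_scale(display_ratios):
    """Return (scale, ratio-to-scale per contrast).

    Assumption: "scale" is the median of the bouquet's display ratios,
    inferred from the tabulated values rather than from a stated definition.
    """
    scale = median(display_ratios.values())
    return scale, {name: dr / scale for name, dr in display_ratios.items()}


# Display ratios as read from Table 23 (six-contrast nominated bouquet).
table23 = {"d2W1": 0.998, "d2R1": 0.300, "d1RW11": 0.337,
           "d1W1": 0.477, "d1R1": 0.533, "d2RW11": 0.356}

# Display ratios as read from Table 24 (four-contrast nominated-plus-elected bouquet).
table24 = {"d2W1": 1.133, "d2r1": 0.491, "d1W1": 0.530, "d1r1": 1.366}

scale23, rts23 = ratios_to_scale(table23)
scale24, rts24 = ratios_to_scale(table24)

if __name__ == "__main__":
    print(scale23, rts23["d2W1"])
    print(scale24, rts24["d2W1"])
```

With an even number of contrasts the median is the average of the two middle display ratios, which is why the computed scales match the tabulated .416 and .831 only to within rounding.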

Figure 14. Person IB1: log response time in Ln seconds. Splitting on date. Helmert contrasts
for rate, polynomial contrasts for weight.


Two Further Illustrations

If  we  give  no  other  example,  the  choice  of  the  term  "refactoring" 
may  come  into  question.  So  we  offer  a  few  possibilities  for  2  x  2 
tables:  one  where  an  interaction  becomes  a  factor  and  vice  versa: 


                  Before                                   Before
               "no"     "yes"                          "no"        "yes"

to "no"        stay     move               stay       to "no"     to "yes"
                                   and
to "yes"       move     stay               move       to "yes"    to "no"

and one where an interaction is transferred into (a) modifying the A
effect, (b) modifying the B effect, and (c) inserting a bonus in one
chosen cell, which can be any one of the four (cf. Seheult and Tukey, 1982):


A main effect                A effect
B main effect                B effect
AB interaction               A-and-B-both-high bonus


D6.  Recapitulation 

Before  considering  the  complete  data  set  (of  all  8  persons)  and 
then  summarizing  the  main  thrust  of  this  paper,  it  is  likely  to  be 
helpful  to  recapitulate  the  steps  we  have  taken,  as  seen  in  the  light  of 
where  we  now  stand. 

In Part A we discussed the analysis of the original 2 x 4 x 7 = 56
numbers  in  terms  that  could,  indeed  should,  have  been  planned  in 
advance  of  seeing  those  numbers.  Our  analysis  focussed  on  pictures 
corresponding  to  various  analyses  of  variance,  where  the 
correspondence  was  mediated  by  (a)  scission  of  each  "line"  into  a 
bouquet  of  contrasts  and  (b)  a  horizontalized  version,  employing 
display  ratios,  of  Daniel's  half-normal  plot.  We  did  this  for  naive  and 
linear-nominated  analyses  of  variance,  both  conventional  and 
aggregated.  The  linear-nominated  analyses  made  clearer  what  the 
naive  analyses  suggested,  namely  that  the  essential  description  was  in 
terms of a few linear-to-the-j slopes, a bending for low rates, and an
apparently  unstructured  mass  of  residuals. 

In  Part  B  we  considered  election  because  of  behavior  in  the  data 
set  before  us,  in  contrast  to  nomination  based  on  past  experience.  We 
found  that  much  the  same  interpretations  were  suggested  in  our 
particular example by the results of election as by nomination, with
the three largest of what would otherwise be the nominated
contrasts, R1, W1, and DW11, being elected as important. We also
considered the nominated bouquet consisting of all 7 linear-to-the-j
contrasts and found that the rate slope R1 could reasonably be
super-elected as the most important. By the end of Part B, we had
come to the conclusion that, as far as the response of IB1 in grams
was concerned, much of the relationship between the response and
rate, weight and date could be adequately described by the six largest
linear-to-the-j nominated contrasts: R1, W1, DW11, DRW111, RW11
and D1, in that order, where the linear rate slope is by far the most
important.

In  Part  C  we  considered  the  possibility  of  reformulating  the 
response  and  found  that,  since  the  response  in  grams  was 
approximately  proportional  to  rate,  reformulating  the  response  to  log 
response  time  greatly  reduced  the  apparent  importance  of  the  linear 
rate  contrast  and  allowed  other  relationships  within  the  data  to 
become  more  apparent.  In  particular  we  discovered  the  relatively 
high  levels  of  the  display  ratios  for  the  trimmed  rate  bouquet  and 
concluded  that  a  systematic  relationship  between  rate  and  the 
response  was  being  spread  across  the  polynomial  contrasts. 

In  Part  D  we  rethought  the  scission  of  rate  into  contrasts.  After 
considering  various  bouquets  which  emphasized  the  ordering  of  the 
levels  of  rate,  we  concluded  that  the  essential  relationship  between 
the response and rate was that person IB1 has a larger response for
the smallest rate than for the 3 larger rates, all of which
are at much the same level. By adopting the Helmert SFP scission of rate
we found that the apparent importance of all interactions involving rate
vanished.

Lastly,  we  have  considered  the  utility  of  refactoring  of  the  pattern 
of  analysis.  We  had  already  split  the  analysis  by  person  based  on  the 
maxim  that  people  are  different.  When  we  further  split  the  analysis 
for  person  IB1  on  date,  we  discovered  that  a  strong  slope  in  weight 
for  day  2  predominated.  Finally,  we  split  over  date  and  used  the 
Helmert  contrasts  as  our  scission  of  rate.  This  eliminated  the  rate- 
by-weight  interactions  as  potentially  important. 

At  the  end  of  our  analysis  of  the  response  of  person  IB1  in  log 
seconds,  we  are  left  with  the  following  account  of  the  data. 

•  a  strong  linear  relationship  between  response  and  weight 
within  day  2,  with  the  response  decreasing  in  weight.  A  much 
weaker,  but  still  notable,  linear  relationship  of  the  same 
direction between response and weight within day 1,

•  a tendency for the responses at the lowest rate to be
significantly  higher  than  the  responses  at  the  other  three  rates, 
all  three  of  which  have  responses  at  much  the  same  level.  This 
relationship  is  the  strongest  for  day  2  but  is  still  notable  at  day  1, 

•  a  collection  of  very  much  smaller  effects,  interactions,  and  noise. 


PART  E:  A  LOOK  AT  THE  OTHER  7  PEOPLE 

El.  All  16  of  the  Person-Date  Units 

We  have  spent  considerable  effort  on  eccentric  IB1.  What  of  the 
other 7 people? With these choices:

response = log response time

no aggregation

nomination = all linear-to-the-j
a nominated bouquet

the  horizontalized  plots  generally  show  relatively  nice  behavior.  For 
5  of  the  7  people,  the  plots  of  the  trimmed  bouquets  are  relatively  flat 
(except,  sometimes,  for  the  smallest  contrasts)  and  all  at  roughly  the 
same  level.  For  these  5  people,  the  nominated  contrasts  contain  the 
bulk  of  the  information  in  the  experiment,  with  little  remaining  in 
the  trimmed  bouquets  besides  background  noise. 

Two  of  the  7  people  deserve  specific  attention:  persons  IB2  and 
IIA2.  The  horizontalized  plots  for  these  persons  are  shown  as  Figures 
15  and  16. 

Notice  that  in  both  of  these  figures  the  plots  of  the  trimmed  rate 
and  trimmed  date-by-rate  bouquet  appear  at  relatively  high  levels  in 
relation  to  the  levels  of  the  bulk  of  the  other  trimmed  bouquets. 
Furthermore,  for  each  person  and  for  each  of  the  trimmed  rate  and 
trimmed  date-by-rate  bouquets,  the  slope  of  the  line  connecting  the 
display  ratios  of  the  two  contrasts  within  the  bouquet  is  negative, 
indicating  that  the  smallest  contrast  in  the  bouquet  is  relatively  larger 
than  might  be  expected.  The  form  of  these  plots  is  symptomatic  of 
the  spreading  of  a  systematic  relationship  across  the  contrasts  in  a 
bouquet. 


Figure 15. Person IB2: log response time in Ln seconds. Polynomial contrasts,
pretrimmed bouquets.

Figure 17 shows the type of spreading for each of the two people.
In  the  plot  of  log  response  time  versus  rate  within  date  for  person  IB2 
shown  as  the  first  panel  of  Figure  17,  we  can  see  that  the  main 
systematic  relationship  that  is  not  being  well  represented  by 
individual  polynomial  contrasts  is  the  large  deviation  of  his  response 
on  date  2  to  a  rate  of  150  gm/30  seconds  from  the  line  through  the 
responses  for  the  other  three  rates  (for  that  date).  This  is  a  third- 
point-off  deviation  from  linearity.  In  the  second  panel  of  Figure  17, 
there  is  a  second-point-off  deviation  from  linearity  in  the  responses  of 
person  IIA2  on  date  2.  The  reasons  for  these  patterns  of  response  for 
these two people, and the equivalent patterns for person IB1
(Figure  10),  are  unknown  but  demonstrate  the  maxim  that  people  will 
be  different.  (We  might  be  suspicious  of  the  randomization.) 

Figure 15 also shows a pattern in the display ratios for the
trimmed weight bouquet. The monotonically decreasing nature of
this plot is also indicative of spreading, of a less obvious kind, as
indicated by Figure 18. The spreadings within the rate and weight
bouquets are at least partly responsible for the high display ratios
seen in Figure 15 for the three smallest rate-by-weight contrasts
(RW25, RW23 and RW16 in ascending order of size of contrast;
descending order in display ratio).

Figure 17. Log response time vs rate by date: persons IB2 and IIA2.

Turning  to  the  nominated  bouquets  for  persons  IB2  and  IIA2,  we 
note that the ordering of the contrasts is different between these two
people and different from that of person IB1. The order of contrasts
within the nominated bouquet for person IB2 is, from largest to
smallest, W1, D1, R1, RW11, DW11, DR11 and DRW111, where W1
would assuredly be elected, D1 and R1 might also be, and the
remaining 4 contrasts, all interactions, are at the background noise
level. For person IIA2, the order of contrasts within the nominated
bouquet is, from largest to smallest, D1, R1, DR11, W1, DW11,
RW11, and DRW111, of which only D1 would be elected.


Figure  18.  Person  IB2:  log  response  time  vs  weight  separated  by  date. 


Let Us Split Again

These  analyses  have  split  over  person.  If  we  further  split  over 
date  and  adopt  a  representation  analogous  to  that  used  in  section  B5, 
making the various slope coefficients (and the centercept-multiplied-by-√28)
equal to plus or minus the square root of the corresponding
mean  square,  we  get  the  results  in  Table  25.  Also  included  in  the 
table  is  the  median  of  the  24  display  ratios  of  the  three  trimmed 
bouquets  within  each  date. 

The  major  point  to  notice  in  Table  25  is  that  the  relationships 
between response and the linear-to-the-j contrasts, split on date, tend
to  differ  from  person  to  person,  and  this  is  true  even  when  we 
compare  the  two  people  of  a  specified  sex  and  sight  combination. 

Any  thoughts  we  have  about  the  further  analysis  of  Table  25  will 
have  to  wait  for  another  day.  If  we  go  to  working  values  of  the 
square  root  of  chi-square  as  divisors  for  the  16  observed  median 
display  ratios,  we  find  that  the  picture  looks  best  near  5  to  6  df  in  the 
chi-square.  Clearly  there  is  more  variability  here  than  we  would 
expect  for  medians  of  24  display  ratios.  We  leave  this,  too,  to  another 
time,  noting  only  the  large  apparent  effect  of  sex  x  sight. 

E2.  Reassembly 

Let  us  suppose,  for  simplicity,  that  for  each  of  the  8  persons,  or, 
perhaps,  for  each  of  the  16  person-date  combinations,  we  have  an 
analysis  whose  main  constituents  are: 

1)  a few fitted constants, say of the linear-to-the-j form for
j = 1, 2 or 3 (for persons) or j = 1, 2 (for person-date
combinations);

2)  an  unresolved  mish-mash  of  residuals. 

We  understand  how  to  look  rather  effectively  at  the  sets  of  8  or  16 
individual  responses  under  (1),  or  even,  perhaps,  at  the  8  or  16 
collective  spreads  extractable  from  (2),  but  how  should  we  look  at  the 
8  or  16  mish-mashes? 


Better  Matching 

If  we  knew  enough  about  the  order  of  presentation  of  the 
treatments,  which  was  once  known,  and  if,  also,  the  effect  of  order  of 
presentation was somewhat consistent, across persons or across
person-date  combinations,  we  could  match  order  of  presentation 
across  subjects  (or  across  subject-date  combinations)  and  see  what 
could  be  extracted.  It  is  probably  reasonable  to  presume  that  such 
matching  would  be  more  effective  than  the  only  matching  we  can 
still  try,  namely  that  in  terms  of  rate-weight-date  or  rate-weight 
combinations. 

In either case, we can think of a two-way table of residuals, 8-by-56
or 16-by-28, where we seek to find some matching across persons,
or  across  person-dates,  but  we  do  not  want  to  —  or  dare  not  — 
assume  that  this  matching  runs  right  across  all  8  persons  or  all  16 
(person-date)  combinations.  These  tables  ought,  it  would  seem,  to  be 
rescaled  (for  person  or  combination)  before  we  tackle  them.  We  need 
some  form  of  sub-factorial  analysis,  since  a  plain  factorial  will  not 
always  work. 

A  first  thing  to  do  in  such  a  situation  would  be  to  do  a  resistant 
row-PLUS-column  analysis,  say  by  median  polish.  If  only  a  few 
persons,  or  a  few  combinations,  escape  some  consistent  pattern,  such 
an  analysis  would  tend  to  find  the  pattern. 
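
As a rough illustration of such a resistant row-PLUS-column analysis, here is a minimal sketch of median polish in one common formulation: alternate sweeps of row medians and column medians out of the table until little changes. The function name, the fixed number of sweeps, and the small demonstration table are hypothetical. The table is not data from the experiment.

```python
from statistics import median


def median_polish(table, n_iter=10):
    """Fit table[i][j] ~ overall + row_eff[i] + col_eff[j] by repeated median sweeps.

    Returns (overall, row_eff, col_eff, residuals). A fixed number of sweeps
    is used for simplicity; a convergence check could replace it.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(row) for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the median out of each row of residuals into the row effects.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [x - m for x in resid[i]]
        # Re-center the column effects, moving their median into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep the median out of each column of residuals into the column effects.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Re-center the row effects, moving their median into the overall term.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid


# Hypothetical 3 x 3 table: additive pattern plus one wild value in the first cell.
demo = [[57, 9, 11],
        [8, 10, 12],
        [9, 11, 13]]
overall, row_eff, col_eff, resid = median_polish(demo)
```

In this made-up table the additive pattern is recovered exactly and the single wild value is left as the only nonzero residual. That resistance is the property wanted when only a few persons, or a few combinations, escape the consistent pattern.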

If  this  fails,  it  seems  natural  to  begin  by  dividing  the  8  persons  or 
16  combinations  into  two  or  three  clusters,  and  then  repeating  the 
first  step  for  the  smaller  tables  that  result. 


Delineations  against  Single  Splits 

Another, quite different, approach would be to select one split and
plot the composite of the other splits against it, delineating (Tukey,
1977) the scatter. If the whole Johnson and Tsao data were split into
8 portions, one per subject, with a 2 x 4 x 7 table of working
residuals for each, this would mean plotting 7 x 56 = 392 residuals
for  other  subjects  against  the  56  values  of  the  selected  subject. 
Enhanced  by  delineation,  such  a  plot  has  a  real  hope  of  detecting 
commonality  between  the  selected  subject  and  a  few  of  the  other 
subjects.  Eight  such  plots  would  not  be  too  many  to  look  at. 

Bear  in  mind  that  the  purpose  of  the  present  section  is  only  to 
indicate  that  sub-factorial  analyses  are  possible,  and  to  stress  that  we 
need  a  body  of  experience  in  their  use  and  modification. 

One blind male and one sighted male were 21; the other two males were 25.


PART F: SUMMARY

The overall thrusts of this account are:

1)  That  simple  graphical  views  of  what  can  easily  be 
calculated  from  factorial  data  can  guide  us  in  structuring  a 
useful  description,  one  that: 

1a)  gives detail, where that is useful,

1b)  avoids detail, where detail would clog our
perception,

1c)  alters such parts of the initially concerned expression,
formulation, and factoring as need to be changed for
increased simplicity of description.

2)  That scission of "lines" of an analysis of variance into
bouquets of contrasts is a useful tool in doing this, noting
that:

2a)  looking  at  alternative  scissions  can  help,  either  by 
making  visible  something  that  deserves  attention,  or 
by  increasing  our  confidence  that  we  were  not 
missing  anything  visible, 

2b)  the  choice  of  the  basic  scissions  should  be  responsive 
to the nature of the corresponding factor: measured,
ordered, weakly structured, or unstructured,

2c)  the  use  of  product  scissions  for  interaction  lines  is  the 
best  way  we  now  know  in  which  to  begin  (splitting 
or  other  refactoring  may  be  urged  on  us  by  the 
results  of  the  initial  analysis), 

2d)  for  measured  factors,  the  use  of  the  conventional 
linear  contrast  as  one  of  each  of  our  bouquets  seems 
desirable, 

2e)  the  most  promising  default  may  be  to  combine  this 
contrast  with  the  EPO  contrasts  (although 
combination  with  Helmert  contrasts  worked  best  for 
the  rate  factor  in  our  example). 

3)  That  modifications  and  enhancements  of  Daniel's  classical 
half-normal  plot,  combined  with  (2),  provide  an  effective 
way  to  do  (1),  noting  that  we  can  make  our  pictures  easier 
to  interpret  and  understand  by: 

3a)  "horizontalizing" the plot by plotting "display ratio"
against working value,

3b)  making separate, sometimes superimposed plots for
different  lines  in  the  corresponding  analysis  of 
variance, 

3c)  separating  lines,  for  the  purpose  of  (3b),  into  as 
meaningful  groups  as  we  can  find. 

4)  That  enhancing  the  basic  analysis  into  lines,  by 
nominating  in  advance  —  or  electing  because  of  the  data's 
behavior  —  certain  contrasts  to  become  separate  "lines"  in 
an  enhanced  or  revised  analysis  can  be  very  important, 
noting  that: 

4a)  nomination  is  "safer"  than  election,  and  deserves  our 
careful  attention, 

4b)  the  loss  from  uncalled-for  nomination  is  small,  since 
our  procedures  will  often  lead  to  reabsorption, 

4c)  the  gain  from  needed  nomination  can  be  great,  both 
in  focussing  more  attention  on  the  nominated 
contrast,  when  such  focussing  is  needed,  and  in 
avoiding  inappropriate  dragging  upward  of  the 
display  ratios  for  other  contrasts. 

5)  That  a  free  hand  in  rescission,  re-expression, 
reformulation,  and  refactoring  can  be  of  great  assistance  in 
such  analysis,  even  though  undeserved  use  of  such 
freedom  may  produce,  with  increased  frequency,  simple 
descriptions  of  one  data  set  that  later  prove  not  to  extend 
to  others,  since: 

5a)  to  trust  in  the  establishment  of  either  qualitative  (e.g. 
structures)  or  quantitative  (e.g.  slopes)  behavior  by 
the  analysis  of  one  data  set  collected  under  one  set  of 
conditions  is  very  poor  science, 

5b)  the  gain  in  understanding  which  usually 
accompanies  a  simpler  description  is  so  frequently 
helpful  in  the  later  use  of  the  results,  even  when  it 
reflects  accidental  serendipity. 
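
Points (2c) to (2e) can be made concrete with a small sketch. The Python below is illustrative only: the function names are hypothetical, the level counts 4 and 7 are chosen merely for the example, and the Helmert contrasts are written in one common textbook form (each level against the levels before it). It does not show how the linear contrast is combined with the Helmert or EPO contrasts within a single bouquet, since that construction is not restated in this summary.

```python
from itertools import product


def linear_contrast(k):
    """Centered linear contrast for k equally spaced levels."""
    mid = (k - 1) / 2
    return [i - mid for i in range(k)]


def helmert_contrasts(k):
    """One common form of Helmert contrasts: level j against all earlier levels.

    Returns k - 1 mutually orthogonal contrasts, each summing to zero.
    """
    rows = []
    for j in range(1, k):
        rows.append([-1.0] * j + [float(j)] + [0.0] * (k - j - 1))
    return rows


def dot(u, v):
    """Ordinary inner product of two equal-length coefficient lists."""
    return sum(a * b for a, b in zip(u, v))


def product_contrast(u, v):
    """Interaction contrast formed as the outer product of two main-effect
    contrasts, flattened row by row (an assumed reading of 'product scission')."""
    return [a * b for a, b in product(u, v)]


if __name__ == "__main__":
    helmert4 = helmert_contrasts(4)   # hypothetical 4-level factor
    lin7 = linear_contrast(7)         # hypothetical 7-level measured factor
    # One element of a product scission for the 4 x 7 interaction line.
    inter = product_contrast(helmert4[0], lin7)
    print(len(inter), sum(inter))
```

Because the inner product of two such product contrasts factors into the product of the inner products of their components, orthogonal scissions of the two factors give an orthogonal scission of the interaction line.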

We  believe  all  of  these  points  apply  to  the  analysis  of  most 
factorial  data  sets  with  3  or  more  factors  (for  the  case  of  several 
factors  at  two  levels  see  Seheult  and  Tukey  1982),  and,  as  well,  to  a 
majority  of  other  instances  of  analysis  of  variance  of  a  similar  size 
and  complexity. 
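
The "horizontalizing" of point (3a) can likewise be sketched in a few lines. The Python below rests on two assumptions that this summary does not spell out: that the working value for the i-th smallest of n absolute contrasts is a typical half-normal order statistic, approximated here with the plotting position (i - 0.5)/n, and that the display ratio is the ordered absolute contrast divided by its working value. The function names are hypothetical.

```python
from statistics import NormalDist


def working_values(n):
    """Approximate typical values of the n ordered absolute values of a
    unit Gaussian sample (assumed plotting position: (i - 0.5) / n)."""
    unit = NormalDist()
    return [unit.inv_cdf(0.5 + 0.5 * (i - 0.5) / n) for i in range(1, n + 1)]


def display_ratios(contrasts):
    """Pair each working value with its assumed display ratio:
    ordered |contrast| divided by the working value."""
    ordered = sorted(abs(c) for c in contrasts)
    work = working_values(len(ordered))
    return [(w, a / w) for w, a in zip(work, ordered)]


if __name__ == "__main__":
    # Made-up contrast values, for illustration only.
    sample = [0.3, -1.1, 0.7, 2.9, -0.2, 0.9, -0.5]
    for w, ratio in display_ratios(sample):
        print(f"working value {w:5.2f}   display ratio {ratio:5.2f}")
```

Under these assumptions, contrasts that behave like pure noise give display ratios scattered about a horizontal level equal to their common scale, whereas an unusually large contrast shows up as a high point at the right-hand end of the plot.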